Summarize dynamic values with Kusto query in Azure Data Explorer - azure-data-explorer

I have this query that almost works:
datatable (timestamp:datetime, value:dynamic)
[
datetime("2021-04-19"), "a",
datetime("2021-04-19"), "b",
datetime("2021-04-20"), 1,
datetime("2021-04-20"), 2,
datetime("2021-04-21"), "b",
datetime("2021-04-22"), 2,
datetime("2021-04-22"), 3,
]
| project timestamp, stringvalue=iif(gettype(value)=="string", tostring(value), ""), numericvalue=iif(gettype(value)=="long", toint(value), int(null))
| summarize any(stringvalue), avg(numericvalue) by bin(timestamp, 1d)
| project timestamp, value=iif(isnan(avg_numericvalue), any_stringvalue, avg_numericvalue)
This splits the values in the value field into stringvalue if the value is a string and numericvalue if the value is a long. Then it summarizes the values per day: for the string values it just takes any value, and for the numeric values it calculates the average.
After this I want to put the values back into the value field.
I was thinking that the last line could look like the one below, but the dynamic function only accepts literals:
| project timestamp, value=iif(isnan(avg_numericvalue), dynamic(any_stringvalue), dynamic(avg_numericvalue))
If I do it like this it will actually work:
| project timestamp, value=iif(isnan(avg_numericvalue), parse_json(any_stringvalue), parse_json(tostring(avg_numericvalue)))
But is there a better way than converting it to json and back?

iff (iif is an equivalent spelling) expects the types of the 2nd and 3rd arguments to match. In your case, one is a number and the other is a string. To fix the issue, just add tostring() around the number. Both branches are then strings, so value becomes a string column:
datatable (timestamp:datetime, value:dynamic)
[
datetime("2021-04-19"), "a",
datetime("2021-04-19"), "b",
datetime("2021-04-20"), 1,
datetime("2021-04-20"), 2,
datetime("2021-04-21"), "b",
datetime("2021-04-22"), 2,
datetime("2021-04-22"), 3,
]
| project timestamp, stringvalue=iif(gettype(value)=="string", tostring(value), ""), numericvalue=iif(gettype(value)=="long", toint(value), int(null))
| summarize any(stringvalue), avg(numericvalue) by bin(timestamp, 1d)
| project timestamp, value=iif(isnan(avg_numericvalue), any_stringvalue, tostring(avg_numericvalue))
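For anyone who wants to check the intended per-day logic outside Kusto, here is a rough Python sketch of the same idea: strings keep one value per day, numbers are averaged, and the two are merged back into one mixed-type value. The function name summarize_by_day is made up for this illustration, and taking the first string of the day is an assumption (Kusto's any() may pick a different one).

```python
from collections import defaultdict
from statistics import mean

# Same sample rows as the datatable above, as (day, value) pairs.
rows = [
    ("2021-04-19", "a"),
    ("2021-04-19", "b"),
    ("2021-04-20", 1),
    ("2021-04-20", 2),
    ("2021-04-21", "b"),
    ("2021-04-22", 2),
    ("2021-04-22", 3),
]


def summarize_by_day(rows):
    """Hypothetical helper: average numbers per day, else keep one string."""
    strings = defaultdict(list)
    numbers = defaultdict(list)
    for day, value in rows:
        if isinstance(value, str):
            strings[day].append(value)
        else:
            numbers[day].append(value)

    result = {}
    for day in sorted(set(strings) | set(numbers)):
        if numbers[day]:
            # Numeric days: average, like avg(numericvalue).
            result[day] = mean(numbers[day])
        else:
            # String-only days: first value seen (assumption; any() is arbitrary).
            result[day] = strings[day][0]
    return result


daily = summarize_by_day(rows)
for day, value in daily.items():
    print(day, value)
```

Unlike the tostring() fix, the values here keep their original types, which is closer to what a dynamic column would hold.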

Related

Application Insights project literal value of severityLevel

Is there a way to project literal value of severityLevel in Application Insights query?
Consider following query:
union
customEvents,
dependencies,
exceptions,
performanceCounters,
traces
| order by timestamp desc
| project timestamp, operation_Name, itemType, severityLevel, message = strcat(name, message, outerMessage), customDimensions, ['details']
In the output, the severityLevel value is numeric. I want the equivalent descriptive value, in accordance with the SeverityLevel enum definition.
You can get the descriptive value of severityLevel by mapping it with case().
Use the query snippet below:
union
customEvents,
dependencies,
exceptions,
performanceCounters,
traces
| order by timestamp desc
| project timestamp, operation_Name, itemType, severityLevel, message = strcat(name, message, outerMessage), customDimensions, ['details']
| extend severityLevel = case(severityLevel == 0, "Verbose",
severityLevel == 1, "Information",
severityLevel == 2, "Warning",
severityLevel == 3, "Error",
severityLevel == 4, "Critical",
"-")

How to use where condition in a kusto/appinsight join

I am trying to achieve the following:
Get the most recent data for certain fields (based on timestamp) -> call this latestRequest
Get the previous data for these fields (basically timestamp < latestRequest.timestamp) -> call this previousRequest
Compute the difference between latestRequest and previousRequest
This is what I have come up with so far:
let LatestRequest=requests
| where operation_Name == "SearchServiceFieldMonitor"
| extend Mismatch = split(tostring(customDimensions.IndexerMismatch), " in ")
| extend difference = toint(Mismatch[0])
, field = tostring(Mismatch[1])
, indexer = tostring(Mismatch[2])
, index = tostring(Mismatch[3])
, service = tostring(Mismatch[4])
| summarize MaxTime=todatetime(max(timestamp)) by service,index,indexer;
let previousRequest = requests
| where operation_Name == "SearchServiceFieldMonitor"
| extend Mismatch = split(tostring(customDimensions.IndexerMismatch), " in ")
| extend difference = toint(Mismatch[0])
, field = tostring(Mismatch[1])
, indexer = tostring(Mismatch[2])
, index = tostring(Mismatch[3])
, service = tostring(Mismatch[4])
|join (LatestRequest) on indexer, index,service
|where timestamp <LatestRequest.MaxTime
However, I get this error from this query:
Ensure that expression: LatestRequest.MaxTime is indeed a simple name
I tried to use toDateTime(LatestRequest.MaxTime) but it doesn't make any difference. What am I doing wrong?
You get this error because you can't refer to a column of a table using dot notation. Simply use the column name (here, MaxTime), since the result of the join operator is a table with the applicable columns from both sides of the join.
An alternative to join might be using the row_number() and prev() functions. You can find the last record and the one before it by ordering the rows by key and timestamp, and then calculating the difference between the current row and the row before it.
Here is an example:
datatable(timestamp:datetime, requestId:int, val:int)
[datetime(2021-02-20 10:00), 1, 5,
datetime(2021-02-20 11:00), 1, 6,
datetime(2021-02-20 12:00), 1, 8,
datetime(2021-02-20 10:00), 2, 10,
datetime(2021-02-20 11:00), 2, 20,
datetime(2021-02-20 12:00), 2, 30,
datetime(2021-02-20 13:00), 2, 40,
datetime(2021-02-20 13:00), 3, 100
]
| order by requestId asc, timestamp desc
| extend rn = row_number(0, requestId !=prev(requestId))
| where rn <= 1
| order by requestId, rn desc
| extend diff = iif(prev(rn) == 1, val - prev(val), val)
| where rn == 0
| project-away rn
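If it helps to see the "latest minus previous per key" idea without Kusto's row_number() and prev(), here is a minimal Python sketch over the same sample rows. It is only an illustration of the logic, not output captured from the query; latest_diffs is a hypothetical name, and keeping val itself as the diff when there is no previous row mirrors the iif() fallback above.

```python
from itertools import groupby
from operator import itemgetter

# Same sample rows as the datatable above: (timestamp, requestId, val).
rows = [
    ("2021-02-20 10:00", 1, 5),
    ("2021-02-20 11:00", 1, 6),
    ("2021-02-20 12:00", 1, 8),
    ("2021-02-20 10:00", 2, 10),
    ("2021-02-20 11:00", 2, 20),
    ("2021-02-20 12:00", 2, 30),
    ("2021-02-20 13:00", 2, 40),
    ("2021-02-20 13:00", 3, 100),
]


def latest_diffs(rows):
    """Hypothetical helper: per requestId, latest val minus the one before it."""
    # Sort by requestId, then timestamp ascending (these strings sort chronologically).
    ordered = sorted(rows, key=itemgetter(1, 0))
    out = []
    for request_id, group in groupby(ordered, key=itemgetter(1)):
        group = list(group)
        latest = group[-1]
        previous = group[-2] if len(group) > 1 else None
        # With no previous row, fall back to val itself, as the iif() does.
        diff = latest[2] - previous[2] if previous is not None else latest[2]
        out.append((latest[0], request_id, latest[2], diff))
    return out


for row in latest_diffs(rows):
    print(row)
```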

How do you access a value in a kusto table by row and column number?

I have a Kusto table counts with 4 rows and 3 columns that has the following elements
HasFailure FunnelPhase count_
0 Experienced 172425
0 NewSubs 25399
1 Experienced 3289
1 NewSubs 643
I would like to access the 3rd element in the 2nd column and save it to a scalar. I have tried the following code:
let value = counts | project count_ lookup 3;
But I am not able to obtain the desired result. What would be the correct way in which to obtain this value?
You'll need to order the records in your table (according to an order you define), then access the 3rd record (according to that same order), and finally project the specific column you're interested in.
For example:
let T =
datatable(HasFailure:bool, FunnelPhase:string, count_:long)
[
0, 'Experienced', 172425,
0, 'NewSubs', 25399,
1, 'Experienced', 3289,
1, 'NewSubs', 643,
]
;
let 3rd_element_in_2nd_column = toscalar(
T
| order by count_ desc
| where row_number() == 3
| project FunnelPhase
)
;
print result = 3rd_element_in_2nd_column
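The same three steps (define an order, take the n-th row, pick one column) look like this as a small Python sketch. nth_value is a made-up helper name, and ordering by count_ descending is the same arbitrary choice made in the query above.

```python
# Same rows as the datatable above.
counts = [
    {"HasFailure": False, "FunnelPhase": "Experienced", "count_": 172425},
    {"HasFailure": False, "FunnelPhase": "NewSubs", "count_": 25399},
    {"HasFailure": True, "FunnelPhase": "Experienced", "count_": 3289},
    {"HasFailure": True, "FunnelPhase": "NewSubs", "count_": 643},
]


def nth_value(rows, n, column, order_key, descending=True):
    """Hypothetical helper: value of `column` in the n-th row (1-based) of the ordered rows."""
    ordered = sorted(rows, key=lambda r: r[order_key], reverse=descending)
    if not 0 < n <= len(ordered):
        return None  # no such row, comparable to an empty toscalar() result
    return ordered[n - 1][column]


result = nth_value(counts, 3, "FunnelPhase", "count_")
print(result)
```

Without an explicit order there is no stable "3rd row", which is why the sort comes first in both versions.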

kusto query to show the third column after using distinct for two other columns

Hi, I'm trying to display the ingestion time (AttemptedIngestTime) in my Kusto query below. Can you please suggest how?
find withsource=source in (cluster(X).database('y*').['TextFileLogs'])
where AttemptedIngestTime > ago(7d)
and FileLineContent contains "<li>Build Number:"
| distinct source , FileLineContent //, AttemptedIngestTime
| extend databaseName = extract(#"""(oci-[^""]*)""", 1, source)
| extend BuildNumber = extract(#"([A-Z]\w*\.[0-9]\d*\.[0-9]\d*\.[0-9]\d*)",1,FileLineContent)
| extend StampVersion = extract(#"([0-9]\d*\.[0-9]\d*\.[0-9]\d*\.[0-9]\d*)",1,FileLineContent)
| extend cluster = X
//| extend IngestedTime = AttemptedIngestTime
| summarize NumberOfRuns=count() by BuildNumber , StampVersion
You could replace distinct source, FileLineContent with summarize min(AttemptedIngestTime) by source, FileLineContent
(or replace min with max, depending on the semantics you want).
Then, you'll still need to decide how you aggregate it in your final summarize (either as min(AttemptedIngestTime), or as a group-by key, e.g. startofday(AttemptedIngestTime)).
Regardless, you should consider following query best practices, and:
replace usage of contains with has.
replace usage of extract with parse.
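To illustrate the first suggestion (swapping distinct for summarize min(...) by ...), here is a minimal Python sketch: it still yields one row per (source, FileLineContent) pair, but also carries the earliest timestamp along. The sample records and the name earliest_per_key are invented for this example.

```python
from datetime import datetime

# Hypothetical sample records: (source, FileLineContent, AttemptedIngestTime).
records = [
    ("db1", "<li>Build Number: A.1.2.3", datetime(2021, 4, 1, 10, 0)),
    ("db1", "<li>Build Number: A.1.2.3", datetime(2021, 4, 1, 8, 0)),
    ("db2", "<li>Build Number: A.1.2.3", datetime(2021, 4, 2, 9, 0)),
]


def earliest_per_key(records):
    """Hypothetical helper: one entry per (source, line), keeping the minimum time."""
    earliest = {}
    for source, line, ingest_time in records:
        key = (source, line)
        if key not in earliest or ingest_time < earliest[key]:
            earliest[key] = ingest_time
    return earliest


earliest = earliest_per_key(records)
for (source, line), ingest_time in earliest.items():
    print(source, line, ingest_time)
```

Switching the comparison to > gives the max variant mentioned above.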

Cassandra - CqlEngine - using collection

I want to know how to work with collections in cqlengine.
I can insert a value into a list, but only one value; I can't append further values to my list.
I want to do this:
In CQL3:
UPDATE users
SET top_places = [ 'the shire' ] + top_places WHERE user_id = 'frodo';
In CqlEngine:
connection.setup(['127.0.0.1:9160'])
TestModel.create(id=1,field1 = [2])
This code adds 2 to my list, but when I insert a new value, it replaces the old value in the list.
The only help I found in the cqlengine docs:
https://cqlengine.readthedocs.org/en/latest/topics/columns.html#collection-type-columns
I also want to know how I can read a collection field with cqlengine.
Is it a dictionary in my Django project? How can I use it?
Please help.
Thanks
Looking at your example it's a list.
Given a table based on the Cassandra CQL documentation:
CREATE TABLE plays (
id text PRIMARY KEY,
game text,
players int,
scores list<int>
)
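You have to declare the model like this:
# Standalone cqlengine imports; in the DataStax driver the same modules
# live under cassandra.cqlengine.
from cqlengine import columns
from cqlengine.models import Model

class Plays(Model):
    id = columns.Text(primary_key=True)
    game = columns.Text()
    players = columns.Integer()
    scores = columns.List(columns.Integer())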
You can create a new entry like this (omitting the connection code):
Plays.create(id = '123-afde', game = 'quake', players = 3, scores = [1, 2, 3])
Then to update the list of scores one does:
play = Plays.objects.filter(id = '123-afde').get()
play.scores.append(20) # <- this will add a new entry at the end of the list
play.save() # <- this will propagate the update to Cassandra - don't forget it
Now if you query your data with the CQL client you should see new values:
id | game | players | scores
----------+-------+---------+---------------
123-afde | quake | 3 | [1, 2, 3, 20]
To read the values in Python, you can simply index the list:
print("Length is %(len)s and 3rd element is %(val)d" %
      {"len": len(play.scores), "val": play.scores[2]})
