Using KQL to identify the absence of a value within a JSON message - azure-data-explorer

I am storing JSON messages within an ADX table. The datatype of the JSON column is a string. Within the JSON message is an array that looks like this
"FilingEntities": [
{
"FilingEntity": 0,
"FilingMethod": 1,
"FilingSubMethod": -1
},
{
"FilingEntity": 29,
"FilingMethod": 1,
"FilingSubMethod": -1
},
{
"FilingEntity": 66,
"FilingMethod": 2,
"FilingSubMethod": -1
}
]
What I am trying to do is write a query that identifies the messages where there is only one instance of a filing array. For example, it looks like this:
"FilingEntities": [
{
"FilingEntity": 0,
"FilingMethod": 1,
"FilingSubMethod": -1
}
]
So far I have been trying to just get the JSON parsed using
EventReceivedRaw
| extend DynamicJson = todynamic(JSONRaw)
| mv-expand DynamicJson
| project UniqueEventGuid, TimeStampInCST, DynamicJson, JSONRaw
but can't really figure out how to interrogate the message to get to the result I am looking for.

Regarding "The datatype of the JSON column is a string": for efficiency, you should strongly consider re-typing this column to be dynamic, so that you don't have to do query-time parsing.
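For example, one possible way to do that, assuming the EventReceivedRaw table and JSONRaw column from your query (note that .alter column makes previously ingested values in that column read back as null, so it usually only makes sense for new tables or together with re-ingestion):
.alter column ['EventReceivedRaw'].['JSONRaw'] type=dynamic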
Regarding "what I am trying to do is write a query that identifies the messages where there is only one instance of a filing array": you could use the array_length() function.
for example:
EventReceivedRaw
| extend DynamicJson = todynamic(JSONRaw)
| where array_length(DynamicJson.FilingEntities) == 1
| project UniqueEventGuid, TimeStampInCST, DynamicJson, JSONRaw
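If the column does end up typed as dynamic, the todynamic() call becomes unnecessary and the same filter can be applied directly; a sketch assuming the same table and column names:
EventReceivedRaw
| where array_length(JSONRaw.FilingEntities) == 1
| project UniqueEventGuid, TimeStampInCST, JSONRaw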

Related

Kusto extractjson not working with email address

I am attempting to use the extractjson() method on source data that includes email addresses (specifically the # symbol).
let T = datatable(MyString:string)
[
'{"user#domain.com": {"value":10}, "userdomain.com": { "value": 5}}'
];
T
| project extractjson('$.["user#domain.com"].value', MyString)
This results in a null being returned; changing the JSONPath to '$.["userdomain.com"].value' does return the correct result.
I know the # sign is used as the current node in a filter expression; does this need to be escaped when used with KQL?
Just as a side note, I ran the same test using Node's 'jsonpath' package and this worked as expected.
const jp = require('jsonpath');
const data = {"user#domain.com": {"value":10}, "name2": { "value": 5}};
console.log(jp.query(data, '$["user#domain.com"].value'));
you can use the parse_json() function instead, in which case you don't have to use extractjson() at all:
print MyString = '{"user#domain.com": {"value":10}, "userdomain.com": { "value": 5}}'
| project parse_json(MyString)["user#domain.com"].value
MyString_user#domain.com_value
10
From the documentation: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/extractjsonfunction
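Applied to the datatable from your question, the same bracket-notation access would look something like this:
let T = datatable(MyString:string)
[
'{"user#domain.com": {"value":10}, "userdomain.com": { "value": 5}}'
];
T
| project parse_json(MyString)["user#domain.com"].value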

Azure Data Explorer: how to update a table from a raw JSON event using Kusto and update policies?

I ingest raw telemetry data as JSON records into a single-column table called RawEvents, in the column called Event. This is what a record/event looks like:
{
    "clientId": "myclient1",
    "IotHubDeviceId": "myiothubdevice1",
    "deviceId": "mydevice1",
    "timestamp": "2022-04-12T10:29:00.123",
    "telemetry": [
        {
            "telemetryId": "total.power",
            "value": 123.456
        },
        {
            "telemetryId": "temperature",
            "value": 34.56
        },
        ...
    ]
}
The RawEvents table is created and set up like this:
.create table RawEvents (Event: dynamic)
.create table RawEvents ingestion json mapping 'MyRawEventMapping' '[{"column":"Event","Properties":{"path":"$"}}]'
There is also a Telemetry table that will be used for queries and analysis. The Telemetry table has strongly typed columns that match the structure of the raw records in RawEvents. It gets created like this:
.create table Telemetry (ClientId:string, IotHubDeviceId:string, DeviceId:string, Timestamp:datetime, TelemetryId:string, Value: real)
In order to get the Telemetry table updated whenever a new raw event gets ingested into RawEvents, I have tried to define a data transformation function and use that function inside an update policy attached to the Telemetry table.
To that end, I have used the following script to verify that my data transformation logic works as expected:
datatable (event:dynamic)
[
dynamic(
{
"clientId": "myclient1",
"IotHubDeviceId": "myiothubdevice1",
"deviceId": "mydevice1",
"timestamp": "2022-04-12T10:29:00.123",
"telemetry": [
{
"telemetryId": "total.power",
"value": 123.456
},
{
"telemetryId": "temperature",
"value": 34.56
}
]
}
)
]
| evaluate bag_unpack(event)
| mv-expand telemetry
| evaluate bag_unpack(telemetry)
Executing that script gives me the desired output which matches the Telemetry table structure:
clientId   deviceId   IotHubDeviceId   timestamp                 telemetryId  value
myclient1  mydevice1  myiothubdevice1  2022-04-12T10:29:00.123Z  total.power  123.456
myclient1  mydevice1  myiothubdevice1  2022-04-12T10:29:00.123Z  temperature  34.56
Next, I have created a function called ExpandTelemetryEvent which contains that same data transformation logic applied to RawEvents.Event:
.create function ExpandTelemetryEvent() {
RawEvents
| evaluate bag_unpack(Event)
| mv-expand telemetry
| evaluate bag_unpack(telemetry)
}
And as a final step, I have tried to create an update policy for the Telemetry table which would use RawEvents as a source and ExpandTelemetryEvent() as the transformation function:
.alter table Telemetry policy update @'[{"Source": "RawEvents", "Query": "ExpandTelemetryEvent()", "IsEnabled": "True"}]'
This is where I got the error message saying
Error during execution of a policy operation: Caught exception while validating query for Update Policy: 'IsEnabled = 'True', Source = 'RawEvents', Query = 'ExpandTelemetryEvent()', IsTransactional = 'False', PropagateIngestionProperties = 'False''. Exception: Request is invalid and cannot be processed: Semantic error: SEM0100: 'mvexpand' operator: Failed to resolve scalar expression named 'telemetry'
I sort of understand why the policy cannot be applied. With the sample script, the transformation worked because the inline data gave enough information to infer what telemetry is, whereas here nothing in RawEvents.Event tells the validator what structure the raw events stored in the Event column will have.
How can this be solved? Is this the right approach at all?
As the bag_unpack plugin documentation indicates:
The plugin's output schema depends on the data values, making it as "unpredictable" as the data itself. Multiple executions of the plugin, using different data inputs, may produce different output schema.
Use a well-defined transformation instead:
RawEvents
| project clientId = Event.clientId, deviceId = Event.deviceId, IotHubDeviceId = Event.IotHubDeviceId, timestamp = Event.timestamp, Event.telemetry
| mv-expand Event_telemetry
| extend telemetryId = Event_telemetry.telemetryId, value = Event_telemetry.value
| project-away Event_telemetry
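To plug that back into the update policy from the question, a sketch could look like the following (it assumes the same table, column, and function names, adds explicit casts so the output schema matches the Telemetry table, and each command is run separately):
.create-or-alter function ExpandTelemetryEvent() {
    RawEvents
    // rename and cast the top-level fields, keep the telemetry array as dynamic
    | project ClientId = tostring(Event.clientId),
              IotHubDeviceId = tostring(Event.IotHubDeviceId),
              DeviceId = tostring(Event.deviceId),
              Timestamp = todatetime(Event.timestamp),
              telemetry = Event.telemetry
    // one output row per telemetry entry
    | mv-expand telemetry
    | project ClientId, IotHubDeviceId, DeviceId, Timestamp,
              TelemetryId = tostring(telemetry.telemetryId),
              Value = toreal(telemetry.value)
}
.alter table Telemetry policy update @'[{"Source": "RawEvents", "Query": "ExpandTelemetryEvent()", "IsEnabled": "True"}]'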

App insights: Can you concatenate two properties together?

I have a custom event with a JSON (string) property called EventInfo. Sometimes this property will be larger than the 150 character limit set on event properties, so I have to split it into multiple properties, i.e. EventInfo0, EventInfo1, etc.
For example (shortened for simplicity)
EventInfo0: [{ "label" : "likeButton", "stat],
EventInfo1: [us" : "success" }]
I found out how to look at EventInfo as JSON in App Insights like this:
customEvents
| where name == "people"
| extend Properties = todynamic(tostring(customDimensions.Properties))
| extend type=parsejson(Properties.['EventInfo'])
| mvexpand type
| project type, type.label, type.status
Is there a way I can concatenate EventInfo0 and EventInfo1 to create the full json string, and query that like above?
According to the documentation, the 150 character limit is on the key, and not on the entire payload. So splitting as you're doing it may not actually be required.
https://learn.microsoft.com/en-us/azure/azure-monitor/app/data-model-event-telemetry#custom-properties
That said, to answer your question: while it's not efficient to do this at query time, the following could work:
datatable(ei0:string, ei1:string)
[
'[{ "label" : "likeButton", "stat]', '[us" : "success" }]',
'[{ "lab]', '[el" : "bar", "hello": "world" }]'
]
| project properties = parse_json(strcat(substring(ei0, 1, strlen(ei0) - 2), substring(ei1, 1, strlen(ei1) - 2)))
| project properties.label
properties_label
----------------
likeButton
bar
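Applied to the customEvents query from your question, that could look something like the following sketch (it assumes the split pieces live under customDimensions.Properties as EventInfo0 and EventInfo1, following the naming pattern described above):
customEvents
| where name == "people"
| extend Properties = todynamic(tostring(customDimensions.Properties))
| extend ei0 = tostring(Properties.EventInfo0), ei1 = tostring(Properties.EventInfo1)
// strip the wrapping [ and ] from each piece, then concatenate and parse
| extend EventInfo = parse_json(strcat(substring(ei0, 1, strlen(ei0) - 2), substring(ei1, 1, strlen(ei1) - 2)))
| project EventInfo.label, EventInfo.status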

Remove sections of a string or parse data?

I have a long string of data and I'm trying to pick out one small piece of it. The position in the string changes all the time. A sample of the data is below. I have researched strip and parse; I'm thinking strip is the wrong choice, but parse might do it.
Data: {'DriverCarSLFirstRPM': 6000.0, 'DriverCarFuelMaxLtr': 44.987, 'DriverCarMaxFuelPct': 0.3, 'Drivers': [{'CarIsAI': 0, 'LicSubLevel': 1, 'TeamID': 0}
I'm trying to get the value for DriverCarFuelMaxLtr. Should I be trying to strip the data before and after that value, or is there a way to separate the file at the commas and then read the values?
Your JSON was invalid to start with.
This has now been validated:
{
    "DriverCarSLFirstRPM": 6000.0,
    "DriverCarFuelMaxLtr": 44.987,
    "DriverCarMaxFuelPct": 0.3,
    "Drivers": [{
        "CarIsAI": 0,
        "LicSubLevel": 1,
        "TeamID": 0
    }]
}
This should help you get started:
import json
data = '{"DriverCarSLFirstRPM": 6000.0, "DriverCarFuelMaxLtr": 44.987,"DriverCarMaxFuelPct": 0.3,"Drivers": [{"CarIsAI": 0,"LicSubLevel": 1,"TeamID": 0}]}'
info = json.loads(data)
# To get the value of "DriverCarFuelMaxLtr"
print(info.get('DriverCarFuelMaxLtr'))
Output: 44.987
It looks like you have a stringified JSON object. You could look into json.loads().
Or, if I am mistaken and for some reason you cannot transform it into a dictionary, I would just use regex (regular expressions).
If you don't know about regex, you can look at this guide and learn how to use them in Python.

Need to generate a JSON array, then loop through the values

I need to create some sort of a state for a bunch of elements on a page.
The states can be 1 or -1.
Now on the server side I will generate a JSON array and put it in my .aspx page like this:
var someArray = { 100:-1, 1001:1, 102:1, 103:-1 }
How do I loop through each value now in JavaScript?
BTW, is my JSON array format correct?
Note that someArray is a misnomer as it is actually an Object. To loop through it, though:
for (var key in someArray) {
    alert(someArray[key]);
}
As far as whether it is valid, the above works for me but I believe technically keys should be strings:
{
    "100": -1,
    "1001": 1,
    "102": 1,
    "103": -1
}
Check out this handy JSON validator.
