Aggregations in Application Insights analytics treated as scalars - azure-application-insights

I have this query in application insights analytics
let total = exceptions
| where timestamp >= ago(7d)
| where problemId contains "Microsoft.ServiceBus"
| summarize sum(itemCount);
let nullContext = exceptions
| where timestamp >= ago(7d)
| where problemId contains "Microsoft.ServiceBus"
| where customDimensions.["SpecificTelemetry.Message"] == "HttpContext.Current is null"
| summarize sum(itemCount);
let result = iff(total == nullContext, "same", "different");
result
but I get this error
Invalid relational operator
I am surprised because yesterday, with the same code (as far as I remember), I was getting a different error saying that both sides of the comparison must be scalars. My understanding was that the aggregation, even though it displays a single value (under sum_itemCount), is not a scalar, but I couldn't find a way to convert it or otherwise get around this.
Thanks

A couple of issues here.
First, the "Invalid relational operator" error is probably due to the empty lines between your let statements. AI Analytics lets you write several queries in the same window and uses empty lines to separate them, so to run all the statements as a single query you need to remove the empty lines.
Regarding the "Left and right side of the relational operator must be scalars" error: the result of the summarize operator is a table, not a scalar. It can contain a single row/column or several of those (think of what happens if you add a "by" clause to the summarize).
To achieve what you want to do you might want to use a single query as follows:
exceptions
| where timestamp >= ago(7d)
| where problemId contains "Microsoft.ServiceBus"
| extend nullContext = customDimensions.["SpecificTelemetry.Message"] == "HttpContext.Current is null"
| summarize sum(itemCount) by nullContext
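Alternatively, if you want to keep the two separate let statements and compare them directly, a minimal sketch is to wrap each aggregation in toscalar(), which converts a single-value tabular result into a scalar (the empty lines between the statements still need to be removed):
let total = toscalar(exceptions
    | where timestamp >= ago(7d)
    | where problemId contains "Microsoft.ServiceBus"
    | summarize sum(itemCount));
let nullContext = toscalar(exceptions
    | where timestamp >= ago(7d)
    | where problemId contains "Microsoft.ServiceBus"
    // tostring() makes the comparison against the dynamic field explicit
    | where tostring(customDimensions["SpecificTelemetry.Message"]) == "HttpContext.Current is null"
    | summarize sum(itemCount));
print result = iff(total == nullContext, "same", "different")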

Related

In KQL how can I use bag_unpack to turn a serialized dictionary object in customDimensions into columns?

I'm trying to write a KQL query that will, among other things, display the contents of a serialized dictionary called Tags which has been added to the Application Insights traces table customDimensions column by application logging.
An example of the serialized Tags dictionary is:
{
"Source": "SAP",
"Destination": "TC",
"SAPDeliveryNo": "0012345678",
"PalletID": "(00)312340123456789012(02)21234987654(05)123456(06)1234567890"
}
I'd like to use evaluate bag_unpack(...) to evaluate the JSON and turn the keys into columns. We're likely to add more keys to the dictionary as the project develops and it would be handy not to have to explicitly list every column name in the query.
However, I'm already using project to reduce the number of other columns I display. How can I use both a project statement, to only display some of the other columns, and evaluate bag_unpack(...) to automatically unpack the Tags dictionary into columns?
Or is that not possible?
This is what I have so far, which doesn't work:
traces
| where datetime_part("dayOfYear", timestamp) == datetime_part("dayOfYear", now())
and message has "SendPalletData"
| extend TagsRaw = parse_json(customDimensions.["Tags"])
| evaluate bag_unpack(TagsRaw)
| project timestamp, message, ActionName = customDimensions.["ActionName"], TagsRaw
| order by timestamp desc
When it runs it displays only the columns listed in the project statement (including TagsRaw, so I know the Tags exist in customDimensions).
evaluate bag_unpack(TagsRaw) doesn't automatically add extra columns to the result set unpacked from the Tags in customDimensions.
EDIT: To clarify what I want to achieve, these are the columns I want to output:
timestamp
message
ActionName
TagsRaw
Source
Destination
SAPDeliveryNo
PalletID
EDIT 2: It turned out that a major part of my problem was that the double quotes within the Tags data were being escaped. While the Tags as viewed in the Azure portal looked like normal JSON, and copied out as normal JSON, when I copied out the whole of a customDimensions record the Tags looked like "Tags": "{\"Source\":\"SAP\",\"Destination\":\"TC\", ... with the double quotes escaped with backslashes.
The accepted answer from David Markovitz handles this situation in the line:
TagsRaw = todynamic(tostring(customDimensions["Tags"]))
A few comments:
When filtering on timestamp, it's better to use the timestamp column As Is and do the manipulations on the other side of the comparison.
When using the has* operators, prefer the case-sensitive one (has_cs) if feasible.
Everything extracted from a dynamic value is also dynamic, and when given a dynamic value, parse_json() (or its equivalent, todynamic()) simply returns it, As Is.
Therefore, we need to treat customDimensions.["Tags"] in 2 steps:
1st, convert it to string. 2nd, convert the result to dynamic.
To reference a field within a dynamic type you can use X.Y, X["Y"], or X['Y'].
No need to combine them as you did with customDimensions.["Tags"].
As the bag_unpack plugin doc states:
"The specified input column (Column) is removed."
In other words, TagsRaw does not exist following the bag_unpack operation.
Please note that you can add a prefix to the columns generated by bag_unpack. This might make it easier to differentiate them from the rest of the columns.
While you can use project, using project-away is sometimes easier.
// Data sample generation. Not part of the solution.
let traces =
print c1 = "some columns"
,c2 = "we"
,c3 = "don't need"
,timestamp = ago(now()%1d * rand())
,message = "abc SendPalletData xyz"
,customDimensions = dynamic
(
{
"Tags":"{\"Source\":\"SAP\",\"Destination\":\"TC\",\"SAPDeliveryNo\":\"0012345678\",\"PalletID\":\"(00)312340123456789012(02)21234987654(05)123456(06)1234567890\"}"
,"ActionName":"Action1"
}
)
;
// Solution starts here
traces
| where timestamp >= startofday(now())
and message has_cs "SendPalletData"
| extend TagsRaw = todynamic(tostring(customDimensions["Tags"]))
,ActionName = customDimensions.["ActionName"]
| project-away c*
| evaluate bag_unpack(TagsRaw, "TR_")
| order by timestamp desc
Output (a single row, shown transposed here for readability):
timestamp          2022-08-27T04:15:07.9337681Z
message            abc SendPalletData xyz
ActionName         Action1
TR_Destination     TC
TR_PalletID        (00)312340123456789012(02)21234987654(05)123456(06)1234567890
TR_SAPDeliveryNo   0012345678
TR_Source          SAP
Fiddle
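One note on the output above: as quoted from the bag_unpack doc, the input column is removed, so TagsRaw itself (which the question's edit lists among the desired columns) does not appear. A minimal sketch of a workaround is to unpack a copy of the column instead (TagsCopy is just an illustrative name):
traces
| where timestamp >= startofday(now())
    and message has_cs "SendPalletData"
| extend TagsRaw = todynamic(tostring(customDimensions["Tags"]))
    ,ActionName = customDimensions["ActionName"]
// unpack a copy so that the original TagsRaw column stays in the output
| extend TagsCopy = TagsRaw
| project-away c*
| evaluate bag_unpack(TagsCopy, "TR_")
| order by timestamp desc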
If I understand correctly, you want to use project to limit the number of columns that are displayed, but you also want to include all of the unpacked columns from TagsRaw, without naming all of the tags explicitly.
The easiest way to achieve this is to switch the order of your steps, so that you first do the project (including the TagsRaw column) and then you unpack the tags. If desired, you can then use project-away to specifically remove the TagsRaw column after you've unpacked it.
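A minimal sketch of that reordering, based on the query from the question (and reusing the todynamic(tostring(...)) conversion from the answer above):
traces
| where datetime_part("dayOfYear", timestamp) == datetime_part("dayOfYear", now())
    and message has "SendPalletData"
| extend TagsRaw = todynamic(tostring(customDimensions["Tags"]))
// keep only the columns of interest first, then unpack the tags
| project timestamp, message, ActionName = tostring(customDimensions["ActionName"]), TagsRaw
| evaluate bag_unpack(TagsRaw)
| order by timestamp desc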

Can't project/extend after `bag_unpack` if empty table

I'm using bag_unpack to explode the customDimensions column in the AppInsights traces table and want to "shape" the resultant table. All is fine if there are rows to work with. If there are not, subsequent operations that reference the exploded columns fail. For example (I boiled it down to an isolated repro),
datatable (Date:datetime, JSON:string )
[datetime(1910-06-11), '{"key": "1"}', datetime(1930-01-01), '{"key": "2"}',
datetime(1953-01-01), '{"key": "3"}', datetime(1997-06-25), '{"key": "4"}']
| where Date > datetime(2000-01-01)
| project parsed = parse_json(JSON)
| evaluate bag_unpack(parsed)
| project-rename value = key
// lots more data shaping here
Since the where filters out all rows, there is nothing to unpack. OK, that's fine, but the data-shaping operations (e.g., project-rename) fail, saying
project-rename: Failed to resolve column reference 'key'
If you change the date in the where to be say 1900-01-01 then everything works as expected.
Note as well that if you remove the bag_unpack and project-rename some other column, it works fine with no rows. For example,
datatable (Date:datetime, JSON:string )
[datetime(1910-06-11), '{"key": "1"}', datetime(1930-01-01), '{"key": "2"}',
datetime(1953-01-01), '{"key": "3"}', datetime(1997-06-25), '{"key": "4"}']
| where Date > datetime(2000-01-01)
| project-rename value = JSON
I can see how the unpack creates the columns, so if it doesn't run, the columns don't get created. But at the same time, why run the project at all if there are no rows?
In theory I could move the where down, but I'm not sure whether the query planner would recognize that and only do the subsequent project/data shaping on the reduced set of rows (filtered by the where). I've got a lot of rows and typically only need to operate on a few of them.
Pointers on how to work with bag_unpack and empty tables? Or columns that may or may not be there?
You could use the column_ifexists() function: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/columnifexists
For example:
... | project value = column_ifexists("key", "")
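Applied to the repro from the question, a sketch might look like this; column_ifexists() falls back to the supplied default when the referenced column doesn't exist, so the query no longer fails when the where filters out every row:
datatable (Date:datetime, JSON:string )
[datetime(1910-06-11), '{"key": "1"}', datetime(1930-01-01), '{"key": "2"}',
datetime(1953-01-01), '{"key": "3"}', datetime(1997-06-25), '{"key": "4"}']
| where Date > datetime(2000-01-01)
| project parsed = parse_json(JSON)
| evaluate bag_unpack(parsed)
// 'key' exists only when bag_unpack produced it; otherwise fall back to an empty string
| project value = column_ifexists("key", "")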

Is it possible to distinguish similarly named properties belonging to same vertex in a MATCH query in Nebula Graph?

I'm playing with Nebula Graph 2.0.0.
I create two tags, both contain a property called name:
create tag t1(name string)
create tag t2(name string)
Now I insert a vertex giving it t1.name property:
> insert vertex t1(name) values '1': ('first-name');
Execution succeeded (time spent 2730/95913 us)
I can see that it can be found by ID:
> match (x:t1) where id(x) == '1' return x
+-------------------------------+
| x |
+-------------------------------+
| ("1" :t1{name: "first-name"}) |
+-------------------------------+
Got 1 rows (time spent 2713/95756 us)
I can also find it by ID and t1.name:
> match (x:t1) where id(x) == '1' and x.name == 'first-name' return x
+-------------------------------+
| x |
+-------------------------------+
| ("1" :t1{name: "first-name"}) |
+-------------------------------+
Got 1 rows (time spent 2827/100963 us)
Now I assign t2.name to the same vertex:
insert vertex t2(name) values '1': ('second-name');
It is added ok:
> match (x:t1) where id(x) == '1' return x
+--------------------------------------------------------+
| x |
+--------------------------------------------------------+
| ("1" :t1{name: "first-name"} :t2{name: "second-name"}) |
+--------------------------------------------------------+
Got 1 rows (time spent 3579/96685 us)
And I can still search by ID and t1.name:
> match (x:t1) where id(x) == '1' and x.name == 'first-name' return x
+--------------------------------------------------------+
| x |
+--------------------------------------------------------+
| ("1" :t1{name: "first-name"} :t2{name: "second-name"}) |
+--------------------------------------------------------+
Got 1 rows (time spent 2534/96306 us)
But when I search by ID and t2.name, nothing gets found:
> match (x:t1) where id(x) == '1' and x.name == 'second-name' return x
Empty set (time spent 3383/96788 us)
So, probably, under the hood, the system thinks I want to search by t1.name, so it finds nothing.
In a different session, with tags named differently, I got different behavior: after adding the second name property, I lost the ability to search by the first name, but it was possible to search using the second one.
Here are my questions:
Is it even legal to have properties with the same name (but coming from different tags) on the same vertex?
If so, how can I ask the query executor to use a specific property (for instance, t1.name or t2.name, and not the one that the system chooses automatically)?
Good catch! Tags are not really native to openCypher; the Nebula team is working to support more and more openCypher features and to improve this.
1. Under the hood, vertex property matching is based on the Nebula index, which for now covers only one tag (it's actually a left RocksDB key match).
2. Once the condition/index matching is done, the MATCH call can return all tags' properties that belong to the vertex.
Is it even legal to have properties with the same name (but coming from different tags) on the same vertex?
It's legal. But maybe in most cases the name would be the same in both tags anyway?
Actually, the WHERE clause here, where id(x) == '1', acts like a primary key, so a single vertex is already specified. In your use case, is the additional AND condition still needed?
If so, how can I ask the query executor to use a specific property (for instance, t1.name or t2.name, and not the one that the system chooses automatically)?
Because of reason 1 above (a property index can cover only one tag/edge type), filtering can currently only be done within one tag, so it's either t1.name or t2.name, e.g. t1.name == "foo" and t1.age > 18.
The interesting case you found (👍🏻) happens because property-index selection (which is rule-based for now; there will be improvements here) cannot be done: id(x) == '1' is not property-index based, and both tags' property indexes are on "name".
To avoid this, perhaps we shouldn't allow the same property name, with different values, across tags on the same vertex; this could be improved in the docs or in the code.

Kusto query max x by y

I'm trying to write a simple Kusto query to find the max value of x for each y. To be more specific, I'm querying the Azure Data Explorer sample table "Covid19", trying to get the max number of deaths by country. I tried a few things like this
Covid19
| summarize MostDeaths = max(Deaths) by Country
| project Country, MostDeaths
But this is not working, of course. I also haven't found anything like this (simple as it is) in the documentation or its examples.
Edit: Expected results:
Actual results: "A recognition error occurred.
Token: |
Line: 3, Position: 0"
Your query, as you've written it in your question - is valid (syntactically and semantically).
You can click this deep-link to run it and see for yourself: https://dataexplorer.azure.com/clusters/help/databases/Samples?query=H4sIAAAAAAAAA3POL8tMMbTkqlEoLs3NTSzKrEpV8M0vLnFJTSzJKFawVchNrNCAcDQVkioVnPNL80qKKoHqC4rys1KTS2AiOkjaACLGJoNVAAAA
Given the error message you're seeing, I can guess that you're actually running a different query.
Perhaps you have a duplicate pipe (|), or you're missing a line break between multiple queries in the same query editor tab?
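As a side note, if you also want to know when each country's maximum occurred, arg_max() returns additional columns from the row that holds the maximum. A sketch, assuming the Covid19 sample table has a Timestamp column:
Covid19
| summarize arg_max(Deaths, Timestamp) by Country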

How to update entries in a table within a nested dictionary?

I am trying to create an order book data structure where a top level dictionary holds 3 basic order types, each of those types has a bid and ask side and each of the sides has a list of tables, one for each ticker. For example, if I want to retrieve all the ask orders of type1 for Google stock, I'd call book[`orderType1][`ask][`GOOG]. I implemented that using the following:
bookTemplate: ([]orderID:`int$();date:"d"$();time:`time$();sym:`$();side:`$();
orderType:`$();price:`float$();quantity:`int$());
bookDict:(1#`)!enlist`orderID xkey bookTemplate;
book: `orderType1`orderType2`orderType3 ! (3# enlist(`ask`bid!(2# enlist bookDict)));
Data retrieval using book[`orderType1][`ask][`ticker] seems to be working fine. The problem appears when I try to add new order to a specific order book e.g:
testorder:`orderID`date`time`sym`side`orderType`price`quantity!(111111111;.z.D;.z.T;
`GOOG;`ask;`orderType1;100.0f;123);
book[`orderType1][`ask][`GOOG],:testorder;
Executing the last statement gives an 'assign error. What's the reason, and how can I solve it?
A couple of issues here. The first is that while you can look up into dictionaries using a series of in-line repeated keys, i.e.
q)book[`orderType1][`ask][`GOOG]
orderID| date time sym side orderType price quantity
-------| -------------------------------------------
you can't assign values this way (you can only assign at one level deep). The better approach is to use dot-indexing (and dot-amend to reassign values). However, the problem is that the value of your book dictionary gets flattened to a table because the list of dictionaries is uniform. So this fails:
q)book . `orderType1`ask`GOOG
'rank
You can see how it got flattened by inspecting the terminal
q)book
| ask
----------| -----------------------------------------------------------------
orderType1| (,`)!,(+(,`orderID)!,`int$())!+`date`time`sym`side`orderType`pric
orderType2| (,`)!,(+(,`orderID)!,`int$())!+`date`time`sym`side`orderType`pric
orderType3| (,`)!,(+(,`orderID)!,`int$())!+`date`time`sym`side`orderType`pric
To prevent this flattening you can force the value to be a mixed list by adding a generic null
q)book: ``orderType1`orderType2`orderType3 !(::),(3# enlist(`ask`bid!(2# enlist bookDict)));
Then it looks like this:
q)book
| ::
orderType1| `ask`bid!+(,`)!,((+(,`orderID)!,`int$())!+`date`time`sym`side`ord
orderType2| `ask`bid!+(,`)!,((+(,`orderID)!,`int$())!+`date`time`sym`side`ord
orderType3| `ask`bid!+(,`)!,((+(,`orderID)!,`int$())!+`date`time`sym`side`ord
Dot-indexing now works:
q)book . `orderType1`ask`GOOG
orderID| date time sym side orderType price quantity
-------| -------------------------------------------
which means that dot-amend will now work too
q).[`book;`orderType1`ask`GOOG;,;testorder]
`book
q)book
| ::
orderType1| `ask`bid!+``GOOG!(((+(,`orderID)!,`int$())!+`date`time`sym`side`o
orderType2| `ask`bid!+(,`)!,((+(,`orderID)!,`int$())!+`date`time`sym`side`ord
orderType3| `ask`bid!+(,`)!,((+(,`orderID)!,`int$())!+`date`time`sym`side`ord
Finally, I would recommend reading this FD whitepaper on how to best store book data: http://www.firstderivatives.com/downloads/q_for_Gods_Nov_2012.pdf

Resources