I'm trying to make a table with these columns:
type | count
I tried this, with no luck:
exceptions
| where timestamp > ago(144h)
| extend
type = type, count = summarize count() by type
| limit 100
Any idea what I'm doing wrong?
You should do this instead:
exceptions
| where timestamp > ago(144h)
| summarize count = count() by type
| limit 100
Explanation:
You should use extend when you want to add new columns to the result or replace existing ones, for example extend day_of_month = dayofmonth(Timestamp). The record count stays exactly the same in this case. See more info in the doc.
You should use summarize when you want to aggregate multiple records, as in your case, so the record count after summarize will usually be smaller than the original record count. See more info in the doc.
By the way, instead of 144h you can use 6d, which is exactly the same but easier to read :)
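To make the row-count difference concrete, here is a minimal Python sketch of the two behaviours on plain dictionaries. It only mimics the semantics and says nothing about how Kusto implements them; the sample rows and the helper names extend_rows and summarize_count are made up for illustration.

```python
from collections import Counter

# Hypothetical sample rows standing in for the exceptions table.
rows = [
    {"type": "NullReferenceException", "day": 3},
    {"type": "TimeoutException", "day": 3},
    {"type": "NullReferenceException", "day": 4},
]

def extend_rows(records, name, fn):
    """Mimic extend: add (or replace) one column, one output row per input row."""
    return [{**r, name: fn(r)} for r in records]

def summarize_count(records, key):
    """Mimic summarize count() by key: one output row per distinct key value."""
    counts = Counter(r[key] for r in records)
    return [{key: k, "count": n} for k, n in counts.items()]

extended = extend_rows(rows, "is_timeout", lambda r: r["type"] == "TimeoutException")
summarized = summarize_count(rows, "type")

print(len(rows), len(extended))  # extend keeps the row count unchanged
print(summarized)                # summarize collapses rows per distinct type
```

The extend-like helper returns as many rows as it was given, while the summarize-like helper returns one row per distinct type, which is the type | count shape asked for in the question.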
I am trying to make a line graph in Azure Data Explorer. I have a single table, and I want each line in the graph to be based on a column in that table.
Using a single column works just fine with the query below:
scandevicedata
| where SuccessFullScan == 1
| summarize SuccessFulScans = count() by scans = bin(todatetime(TransactionTimeStampUtc), 30s)
The problem is that I now want to add a second column from the same table, like this:
scandevicedata
| where UnSuccessFullScan == 1
| summarize UnSuccessFulScans = count() by scans = bin(todatetime(TransactionTimeStampUtc), 30s)
As you can see, the first query extracts successful scans and the second extracts unsuccessful scans. Now I want to combine them in the same output and graph them, but I can't figure out how to do this since they are different columns.
How can I achieve this?
You could use the countif() aggregation function:
scandevicedata
| summarize
Successful = countif(SuccessFullScan == 1),
Unsuccessful = countif(UnSuccessFullScan == 1)
by bin(todatetime(TransactionTimeStampUtc), 30s)
| render timechart
How do I write my query to create the data result in the proper format to be plotted in multiple panels using the | render timechart with (ysplit=panels) output?
Looking at Microsoft's examples, I need my IPPrefix column to produce multiple columns in a single row. Instead, my query is producing a separate row for each grouping in IPPrefix.
I have the following query:
let startTime = datetime('2020.07.23 20:00:00');
let endTime = datetime('2020.07.23 23:59:00');
AzureDiagnostics
| where TimeGenerated between (startTime..endTime)
| where ResourceType == "APPLICATIONGATEWAYS" and OperationName == "ApplicationGatewayAccess"
| where requestUri_s contains "api/auth/ping"
| extend IPParts = split(clientIP_s, '.')
| extend IPPrefix = strcat(IPParts[0], '.', IPParts[1], '.', IPParts[2])
| make-series Count = count() on TimeGenerated in range(startTime, endTime, 5m) by IPPrefix
//| summarize AggregatedValue = count() by IPPrefix, bin(TimeGenerated, 1m)
| render timechart with (ysplit=panels)
I want the result to look something like:
But instead, all the y-series are plotted in a single panel like:
I suppose that I am not using make-series correctly to produce the result I need, but I have not been able to apply it in a different way to make it work.
I realized that I needed to pivot on the data before rendering. I also learned there is a limit of 5 panels on the ysplit=panels option. I had to limit the series to five and then perform a pivot on the aggregated data.
...
| make-series Count = count() on TimeGenerated in range(startTime, endTime, 1m) by IPPrefix
| take 5
| evaluate pivot(IPPrefix, any(Count), TimeGenerated)
| render timechart with(ysplit=panels)
Resulting chart with five panels.
I've got a simple KQL query that plots the (log of the) count of all exceptions over 90 days:
exceptions
| where timestamp > ago(90d)
| summarize log(count()) by bin(timestamp, 1d)
| render timechart
What I'd like to do is add some reference lines to the timechart this generates. Based on the docs, this is pretty straightforward:
| extend ReferenceLine = 8
The complicating factor is that I'd like these reference lines to be based on aggregations of the value I'm plotting. For instance, I'd like a reference line for the minimum, mean, and 3rd quartile values.
Focusing on the first of these (minimum), it turns out that you can't use min() outside of summarize(). But I can use this within an extend().
I was drawn to min_of(), but this expects a list of arguments instead of a column. I'm thinking I could probably expand the column into a series of values, but this feels hacky and would fall down beyond a certain number of values.
What's the idiomatic way of doing this?
You could try something like the following:
exceptions
| where timestamp > ago(90d)
| summarize c = log(count()) by bin(timestamp, 1d)
| as hint.materialized=true T
| extend _min = toscalar(T | summarize min(c)),
_perc_50 = toscalar(T | summarize percentile(c, 50))
| render timechart
I want to show a graph of minimum value, maximum value and difference between maximum and minimum for each timeslice.
It works ok for min and max
| parse "FromPosition *)" as FromPosition
| timeslice 2h
| max(FromPosition) ,min(FromPosition) group by _timeslice
but I couldn't find the correct way to specify the difference.
e.g.
| (max(FromPosition)- min(FromPosition)) as diffFromPosition by _timeslice
returns the error: Unexpected token 'b' found.
I've tried a few different combinations to declare them on different lines as suggested on https://help.sumologic.com/05Search/Search-Query-Language/aaGroup. e.g.
| int(FromPosition) as intFromPosition
| max(intFromPosition) as maxFromPosition , min(intFromPosition) as minFromPosition
| (maxFromPosition - minFromPosition) as diffFromPosition
| diffFromPosition by _timeslice
without success.
Can anyone suggest the correct syntax?
Try this:
| parse "FromPosition *)" as FromPosition
| timeslice 2h
| max(FromPosition), min(FromPosition) by _timeslice
| _max - _min as diffFromPosition
| fields _timeslice, diffFromPosition
The group by is for the min and max functions to know what range to work with; it is not a group by for the overall search query. That's why you were getting the syntax errors, and it's one reason I prefer to just use by as above.
For these kinds of queries I usually prefer a box plot where you would just do:
| min(FromPosition), pct(FromPosition, 25), pct(FromPosition, 50), pct(FromPosition, 75), max(FromPosition) by _timeslice
Then select box plot as the graph type. It looks great on a dashboard and provides a lot of detailed information about deviation and such at a glance.
I run a simulation with varying number of iterations and each iteration creates an output table like table1,table2,table3... They all have the same structure like:
ID | value
but varying number of rows.
For each table, I want to compute the average of the 'value' column and show the averages in a new table like:
tableNumber | averageValue
1 | 516
2 | 512
3 | 521
... | ...
Is this possible in SQLite if the number of tables is quite high? And if not, how can I achieve this in a different way?
Thanks a lot in advance :-)
Instead of creating different tables, put the results in the same table, with a column that indicates which batch or set each row belongs to. Then, when you query the table, you can filter on that column so that you're working only with the desired batch or set. Put an index on that column to make the query run faster. There is also no need to save the averages to a separate table: your query can produce the result without persisting it as data in another table.
select batch, avg(value) as AvgValue
from simulation
where batch = 100
group by batch
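As a rough end-to-end illustration of the single-table approach, the following Python sketch uses the standard sqlite3 module with an in-memory database. The table name simulation and the column batch follow the query above; the index name idx_simulation_batch and the sample values are assumptions made up for the example. Dropping the where clause returns one average per batch, which is the tableNumber | averageValue shape asked for in the question.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("create table simulation (batch integer, id integer, value real)")
# Hypothetical index name; it helps filtering and grouping on batch.
conn.execute("create index idx_simulation_batch on simulation (batch)")

# Made-up sample data: (batch, id, value). Batches may have different row counts.
sample = [
    (1, 1, 510.0), (1, 2, 522.0),
    (2, 1, 500.0), (2, 2, 512.0), (2, 3, 524.0),
    (3, 1, 521.0),
]
conn.executemany(
    "insert into simulation (batch, id, value) values (?, ?, ?)", sample
)

# One row per batch with its average; no per-iteration tables are needed.
averages = conn.execute(
    "select batch, avg(value) as AvgValue "
    "from simulation group by batch order by batch"
).fetchall()

for batch, avg_value in averages:
    print(batch, avg_value)

conn.close()
```

Each simulation iteration would only need to insert its rows with a new batch number, so the number of iterations no longer affects the schema.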