Kusto : Summarize count by hours of the day (hours in column) - azure-data-explorer

I have a list of metrics that I want to visualize by name (row) and count by hours of the current day (column)
The example below create a row by Hour and metric name
customMetrics
| extend hour= floor( timestamp % 1d , 1h)
| where name contains "WebServiceCall-"
| summarize event_count = sum(value) by hour, name
I want the data display like this:
MetricName | Count Hour0 | Count Hour2 | Count Hour3 | ... | Count Hour24
Is it possible to do it with Kusto?

Yes, you can use the pivot plugin for this.

Thanks you Avnera
customMetrics
| where name contains "WebServiceCall-"
| extend Hour= floor( timestamp % 1d , 1h)
| project name, Hour, value
| evaluate pivot(Hour, sum(value))

Related

Calculate Count of users every month in Kusto query language

I have a table named tab1:
Timestamp Username. sessionid
12-12-2020. Ravi. abc123
12-12-2020. Hari. oipio878
12-12-2020. Ravi. ytut987
11-12-2020. Ram. def123
10-12-2020. Ravi. jhgj54
10-12-2020. Shiv. qwee090
10-12-2020. bob. rtet4535
30-12-2020. sita. jgjye56
I want to count the number of distinct Usernames per day, so that the output would be:
day. count
10-12-2020. 3
11-12-2020. 1
12-12-2020. 2
30-12-2020. 1
Tried query:
tab1
| where timestamp > datetime(01-08-2020)
| range timestamp from datetime(01-08-2020) to now() step 1d
| extend day = dayofmonth(timestamp)
| distinct Username
| count
| project day, count
To get a very close estimation of the number of Usernames per day, just run this (the number won't be accurate, see details here):
tab1
| summarize dcount(Username) by bin(Timestamp, 1d)
If you want accurate results, then you should do this (just note that the query will be less performant than the previous one, and will only work if you have up to 1,000,000 usernames / day):
tab1
| summarize make_set(Username) by bin(Timestamp, 1d)
| project Timestamp, Count = array_length(set_Username)

computing offset for prev dynamically

I want to set offset for prev dynamically, based on number of items in a group. for e.g
T
| make-series value = sum(value) on timestamp from .. to .. step 5m by customer
| summarize by bin(timestamp,1h), customer
| extend prev_value = prev(value,<offset>)
The offset here should be equal to number of distinct customers. How can i compute this offset dynamically
If you can split query into small parts, you can use toscalar function to get number of unique customers.
This would be my approach...
let tab_series =
T
| make-series value = sum(value) on timestamp from .. to .. step 5m by customer
;
let no_of_distinct_customers =
toscalar(tab_series | distinct customer | summarize count())
;
tab_series
| summarize by bin(timestamp, 1h), customer
| extend prev_value = prev(value, no_of_distinct_customers)
You can find example here.

Use values from one table in the bin operator of another table

Consider the following query:
This will generate a 1 cell result for a fixed value of bin_duration:
events
| summarize count() by id, bin(time , bin_duration) | count
I wish to generate a table with variable values of bin_duration.
bin_duration will take values from the following table:
range bin_duration from 0 to 600 step 10;
So that the final table looks something like this:
How do I go about achieving this?
Thanks
The bin(value,roundTo) aka floor(value,roundTo), will round value down to the nearest multiple of roundTo, so you don't need an external table.
events
| summarize n = count() by bin(duration,10)
| where duration between(0 .. 600)
| order by duration asc
You can try this out on the Stormevents tutorial:
let events = StormEvents | extend duration = (EndTime - StartTime) / 1h;
events
| summarize n = count() by bin(duration,10)
| where duration between(0 .. 600)
| order by duration asc
When dealing with timeseries data, bin() also understands the handy timespan literals, ex.:
let events = StormEvents | extend duration = (EndTime - StartTime);
events
| summarize n = count() by bin(duration,10h)
| where duration between(0h .. 600h)
| order by duration asc

Order of columns after pivot in application insights

User wants a count of unique sessions per week in application insights. I have the query working, including a pivot, but the Week columns are out of order. I would prefer if they were in order.
pageViews
| where timestamp < now()
| summarize Sessions= dcount(session_Id)
by Week=bin(datepart("weekOfYear", timestamp), 1), user_AuthenticatedId
| order by Week
| evaluate pivot(Week, sum(Sessions))
| join kind=innerunique (pageViews
| summarize MostRecentRequest = max(timestamp) by user_AuthenticatedId)
on $right.user_AuthenticatedId == $left.user_AuthenticatedId
| project-away user_AuthenticatedId1
I've tried ordering by timestamp before the summarize, and ordering by week after the summarize (still in there) and no luck.
There's currently a "trick" that will work: serialize right after your order by
pageViews
| where timestamp < now()
| where isnotempty(user_AuthenticatedId)
| summarize Sessions= dcount(session_Id)
by Week=bin(datepart("weekOfYear", timestamp), 1), user_AuthenticatedId
| order by Week
| serialize // <--------------------------------- RIGHT HERE
| evaluate pivot(Week, sum(Sessions))
| join kind=innerunique (pageViews
| summarize TotalSessions=dcount(session_Id), MostRecentRequest = max(timestamp) by user_AuthenticatedId)
on $right.user_AuthenticatedId == $left.user_AuthenticatedId
| project-away user_AuthenticatedId1
| top 100 by TotalSessions desc
gets me this in workbooks, with the weeks in descending order (I also added total session count to sort/top by with some custom column settings set):
the custom settings I have for the column settings in workbooks:
delete all the #'d columns that are there by default and add one for ^[0-9]+$ set to heatmap:
I refactored query a bit for my own comprehension. I took the the left and right into "views". Thought I'd share.
let users_MostRecent_Session =
pageViews
| summarize
TotalSessions=dcount(session_Id)
, MostRecentRequest = max(timestamp)
by
user_AuthenticatedId
;
//
let users_sessions_ByWeek =
pageViews
| where timestamp < now()
| where isnotempty(user_AuthenticatedId)
| summarize
Sessions= dcount(session_Id)
by
Week=bin(datepart("weekOfYear", timestamp), 1)
, user_AuthenticatedId
| order by Week
| serialize
| evaluate pivot(Week, sum(Sessions))
;
//
//
users_sessions_ByWeek
| join kind=innerunique
users_MostRecent_Session
on user_AuthenticatedId
| project-away user_AuthenticatedId1
| top 100 by TotalSessions desc

sqlite get data from one field on two different dates and present as two columns

I have a sqlite database with some time series data:
holdings:
| id | date | instrument | position | price | portfolio | sector |
prices:
| id | date | instrument | open | high | low | close | adjclose |
static_data
| id | ticker | name | sector | industry | country | currency |
and I'd like to get the holdings for a particular day with the change in price on that day as a calculated field.
I've tried the following query
SELECT h.date,
h.portfolio,
h.instrument,
s.name,
h.position,
p.adjclose AS curpx,
(p.adjclose AS lastpx WHERE
h.date = "2013-01-10 00:00:00" AND
h.instrument = p.instrument)
FROM holdings AS h,
static_data AS s,
prices AS p
WHERE h.date = "2013-01-11 00:00:00"
AND h.portfolio = "usequity"
AND (h.instrument = p.instrument)
AND (h.date = p.date)
AND (h.instrument = s.ticker);
but I get a syntax error.
[2014-11-14 06:11:04] [1] [SQLITE_ERROR] SQL error or missing database (near "as": syntax error)
I'm a complete N00b at SQL so I'd like to know how I can get two sets of data from the same table and show them side by side or to perform a calculation using one against the other in SQL.
Thanks
You want a correlated subquery:
SELECT ...,
p.adjclose AS curpx,
(SELECT p2.adjclose
FROM prices AS p2
WHERE p2.date = datetime(h.date, '-1 days')
AND p2.instrument = h.instrument
) AS lastpx
FROM ...

Resources