Analytics data deleted from Azure Application Insights

I noticed that when I query events data, the total record count decreases on its own. I would like to know when data is deleted from Application Insights. My query is just a simple count query. I also noticed that some event timestamps were updated. I searched blogs and other sources but found nothing.
customEvents | count

Unless you explicitly set a time range in the query directly, like
customEvents | where timestamp >= ago(7d) | count
then the analytics website automatically applies a last-24-hours time restriction. (Look in the upper right corner: there is a dropdown that says "set in query", "last 24 hours", or one of the other choices.)

Related

What is the best strategy for optimizing aggregation queries over a long period of time/dataset in Azure Data Explorer (Kusto)?

I have some queries that aggregate per-second data over a long period (180 days); an example query is below. The table's hot cache is 31 days, so the queries can take over a minute to return, which is not acceptable for the dashboards I want to display them on. What optimization strategies would you recommend? My thoughts so far are to either use an update policy to push the data for these tags into a separate table with a hot cache of 180 days, or to use a materialized view.
raw_table
| where TimeStamp between (now(-180d) .. now()) and TagName in ("Tag1","Tag2")
| extend Date = startofday(TimeStamp)
| summarize Value1=max(Value) by Date,TagName
| summarize Value1=sum(Value1) by Date
| project TagName="AggregatedData", Date, Value=Value1
Both options you mentioned (an update policy that feeds a separate table with a 180-day hot cache, or a materialized view) are appropriate; you can even combine them if required.

Question using distinct and grouping on Windows Virtual Desktop session statistics

I need some help creating a Kusto query for filtering and grouping Windows Virtual Desktop statistics.
What I need: A chart that shows the total number of sessions of users logged in to a WVD Host Pool.
Data available: WVD logs this info to Log Analytics every x seconds for each host. The interval is not exactly "every x seconds", but each host reports at least once every 3 minutes.
So I made this query for now:
WVDAgentHealthStatus
| where TimeGenerated > ago(3m)
| project SessionHostName, TimeGenerated, ActiveSessions, InactiveSessions, Totalsessions=(toint(ActiveSessions) + toint(InactiveSessions))
That makes results like this:
As you can see, some hosts are reported twice, some three times.
I need help with:
How to make this query usable in a chart, so that it shows the number of sessions per host in 3-minute steps, covering the last 8 hours.
I guess "where TimeGenerated" has to be "> ago(8h)", but the query also needs to group the data into 3-minute sets and then get distinct data per host.
I have no idea how to do this. I'm not that good at Kusto. Can anyone help me out?
You need to use the summarize operator:
WVDAgentHealthStatus
| where TimeGenerated > ago(8h)
| summarize arg_max(TimeGenerated, ActiveSessions) by bin(TimeGenerated, 3m), SessionHostName
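To make the grouping concrete, here is a small Python sketch of what `summarize arg_max(...) by bin(TimeGenerated, 3m), SessionHostName` does conceptually: rows are bucketed into 3-minute windows per host, and only the latest row in each bucket is kept. The sample rows and the `bin_time` / `latest_per_bin` helpers are hypothetical and only illustrate the idea; they are not taken from real WVD logs.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

def bin_time(ts, width):
    """Round ts down to a multiple of width (similar in spirit to Kusto's bin())."""
    return EPOCH + ((ts - EPOCH) // width) * width

def latest_per_bin(rows, width=timedelta(minutes=3)):
    """Keep only the most recent row for each (time bin, host) pair."""
    latest = {}
    for row in rows:
        key = (bin_time(row["TimeGenerated"], width), row["SessionHostName"])
        if key not in latest or row["TimeGenerated"] > latest[key]["TimeGenerated"]:
            latest[key] = row
    return latest

# Hypothetical sample: host-1 reports twice inside the same 3-minute window.
rows = [
    {"SessionHostName": "host-1", "TimeGenerated": datetime(2021, 1, 1, 10, 0, 30),
     "ActiveSessions": 4, "InactiveSessions": 1},
    {"SessionHostName": "host-1", "TimeGenerated": datetime(2021, 1, 1, 10, 2, 10),
     "ActiveSessions": 5, "InactiveSessions": 1},
    {"SessionHostName": "host-2", "TimeGenerated": datetime(2021, 1, 1, 10, 1, 0),
     "ActiveSessions": 2, "InactiveSessions": 0},
    {"SessionHostName": "host-1", "TimeGenerated": datetime(2021, 1, 1, 10, 3, 5),
     "ActiveSessions": 6, "InactiveSessions": 0},
]

result = latest_per_bin(rows)
for (window, host), row in sorted(result.items()):
    total = row["ActiveSessions"] + row["InactiveSessions"]
    print(window, host, total)
```

The duplicate report from host-1 inside the 10:00 window collapses to its latest row, so each host contributes one point per 3-minute step, which is the shape a time chart needs.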

Kibana KQL search saved with time range

I'm creating a saved search using Kibana Discover. On screen I can select the time range manually (default is "Last 15 minutes"). If I select "Last 24 hours" and Refresh the search, it works fine. However, when I save the search, the time range information is not saved. Whenever I open Kibana Discover and open my saved search, the time range is always the default (last 15 minutes).
I tried adding different conditions (below) to my KQL query, to no avail. :^(
and @timestamp >= "now-24h"
and @timestamp >= now-24h
and timestamp >= "now-24h"
and @timestamp >= "now-1d/d"
etc.
I also tried to "Add filter" using the @timestamp field. The only operators available are: is not, is one of, is not one of, is between, is not between, exists, does not exist.
So, I tried between now-24h and now. Yet, no results match the criteria.
However, if I manually change the time range to say, Last 30 days, then my filter works!
Question
How do I include the time range, say last 24 hours, in my KQL saved search so that it overrides the time range that is manually set on screen?
(Motivation: I don't want to change the default manual time range to be larger than the range used in all saved searches. Also, I want to make the saved searches available to a support team with minimal manual configuration.)
(I'm using Kibana v7.10.0.)

BigQuery to Data Studio : Show reliable COUNT DISTINCT regardless of the selected period

In my BigQuery project I store event data integrated from Firebase. The granularity and number of dimensions are such that presenting raw data in Data Studio quickly makes the report VERY slow (1-2 min per page/interaction).
I then started to think how I could create pre-aggregated tables in BigQuery to speed everything up, but quickly realised COUNT DISTINCT metrics would be a problem with this approach.
Let me explain:
SELECT user, date
FROM UNNEST([
STRUCT("Adam" AS user, "20190923" AS date),
("Bob", "20190923"),
("Carl", "20190923"),
("Adam", "20190924"),
("Bob", "20190924"),
("Adam", "20190925"),
("Carl", "20190925"),
("Bob", "20190926")
]) AS website_visits;
+------+----------+
| User | Date     |
+------+----------+
| Adam | 20190923 |
| Bob  | 20190923 |
| Carl | 20190923 |
| Adam | 20190924 |
| Bob  | 20190924 |
| Adam | 20190925 |
| Carl | 20190925 |
| Bob  | 20190926 |
+------+----------+
The above is a table of website visits.
Clearly, creating a pre-aggregated table like
SELECT date, COUNT(DISTINCT user) FROM website_visits GROUP BY date
has the limitation that the count cannot be aggregated further (let alone dynamically) to get a total: doing a SUM over the daily counts would return 8 unique users, which is not correct, as there are only 3 unique users.
In BigQuery, this is fixed by using HLL_COUNT, which despite the approximation works ok for me.
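The non-additivity described above can be checked with a short Python sketch over the same sample rows. Exact Python sets stand in here for HLL sketches (an assumption made purely for illustration): like sketches, per-day sets can be merged over any date range before a count is extracted, whereas per-day counts cannot. The `distinct_users` helper is hypothetical.

```python
from collections import defaultdict

# Same rows as the website_visits example.
visits = [
    ("Adam", "20190923"), ("Bob", "20190923"), ("Carl", "20190923"),
    ("Adam", "20190924"), ("Bob", "20190924"),
    ("Adam", "20190925"), ("Carl", "20190925"),
    ("Bob", "20190926"),
]

# Pre-aggregate one set of users per day (stand-in for one sketch per day).
users_by_day = defaultdict(set)
for user, date in visits:
    users_by_day[date].add(user)

# Summing per-day distinct counts counts each user once per active day.
sum_of_daily = sum(len(users) for users in users_by_day.values())

def distinct_users(start, end):
    """Distinct users in [start, end], obtained by merging the daily sets."""
    selected = [users for day, users in users_by_day.items() if start <= day <= end]
    return len(set().union(*selected))

print(sum_of_daily)                            # 8
print(distinct_users("20190923", "20190926"))  # 3
```

The key design point is that the pre-aggregated table stores something mergeable per day (a set here, a sketch in BigQuery) rather than a final number.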
Now to the big question:
How can I do the same so that the result is displayable in Data Studio?
HLL_COUNT.EXTRACT is not available as function in there, and in the reporting I always have to keep in mind that the date range is set by the user however (s)he likes so it's not possible to store a pre-aggregated result for ALL cases...
EDIT 1: APPROX_COUNT_DISTINCT
As per answer from Bobbylank, I tried to use APPROX_COUNT_DISTINCT.
However I found that this just seems to move the issue down the line. My fault for not explaining what's over there.
Although performance is acceptable, it does not seem possible to blend a data source with this calculated metric.
Example: After displaying the amount of unique users in the selected period (which now works), I'm also trying to display Average Revenue Per User (ARPU) in Data Studio like Firebase does.
To do this, I have to SUM(REVENUE) / APPROX_COUNT_DISTINCT(USER)
Clearly, REVENUE works ok with pre-aggregation and is available in the raw data. I tried then to blend the raw data with a table containing just user visits. However APPROX_COUNT_DISTINCT can't be used in the blended data definition as calculated metrics are not allowed.
I also tried using the USER field as a metric with Count Distinct aggregation. That returns the correct figures when revenue and user count are shown separately, but when I try to divide them the problem becomes aggregation (apply SUM or AVG to the field and the result is basically AVG(REVENUE/USERS) for each day).
I then tried to store REVENUE directly in the visits table, but was reminded by Data Studio that I can't mix dimensions and metrics in a calculated field.
APPROX_COUNT_DISTINCT might be more performance friendly for you?
https://support.google.com/datastudio/answer/9189108?hl=en
Otherwise, the only way I can think of would be to pre-calculate several metrics (e.g. unique users on that day, 7-day cumulative, 14-day, etc.), as your customer requires, for each single day.
Or you could provide a 2 page report with both of these methods with the caveat that the first can be used over a time period but will be much slower?
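As a rough illustration of the pre-calculation idea, the Python sketch below computes, for each day, the number of distinct users in a trailing window ending on that day. It reuses the sample visits from the question; the `rolling_unique_users` helper and the 2-day window are hypothetical choices for the example (the sample only spans four days, so a 7-day window would not show much).

```python
from datetime import date, timedelta

# Sample visits from the question, with dates parsed.
visits = [
    ("Adam", date(2019, 9, 23)), ("Bob", date(2019, 9, 23)), ("Carl", date(2019, 9, 23)),
    ("Adam", date(2019, 9, 24)), ("Bob", date(2019, 9, 24)),
    ("Adam", date(2019, 9, 25)), ("Carl", date(2019, 9, 25)),
    ("Bob", date(2019, 9, 26)),
]

def rolling_unique_users(visits, window_days):
    """For each day with data, count distinct users in the window ending that day."""
    days = sorted({day for _, day in visits})
    result = {}
    for day in days:
        start = day - timedelta(days=window_days - 1)
        result[day] = len({user for user, d in visits if start <= d <= day})
    return result

one_day = rolling_unique_users(visits, 1)
two_day = rolling_unique_users(visits, 2)
for day in sorted(one_day):
    print(day, one_day[day], two_day[day])
```

Each window length becomes its own precomputed column, so the report can only offer the window sizes chosen in advance, not an arbitrary user-selected date range.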

Using application insights REST API for reading custom events

We have a custom event on a page that tracks link clicks on that page and sends them to App Insights. With the REST API, we would like to get the most frequently accessed links from App Insights.
How can we build the query to get this analytics data? Is there any sample available for reading custom events?
Thanks
If you open the Application Insights Analytics website for any resource, there are some "Common Queries" examples right on the front page. One of them is called "Usage"; if you click it, it will show you this one:
//What are the top 10 custom events of your application in the past 24 hours?
customEvents
| where timestamp >= ago(24h)
| summarize dcount(user_Id), count() by name
| top 10 by count_
| render barchart
which:
queries customEvents,
filtering to the last 24 hours (timestamp >= ago(24h)),
does a summary of the distinct count of users (dcount(user_Id)) and the total number of events (count()), grouped by the event name (by name),
then filters to the top 10 by the count_ column created by the summarization (top 10 by count_)
and then renders it as a bar chart (render barchart)
there are many other examples on the analytics home page as well.
Edit to add: You can easily query any custom properties or metrics that you send as well. The customDimensions and customMeasurements fields in each event type are JSON-typed fields, and if there are no spaces in the names, you can just use dot notation to grab values. If the field name has spaces or special characters, use brackets and quotes:
customEvents
| where timestamp >= ago(1h)
| extend a = customDimensions.NameOfFieldWithNoSpacesOrSpecialCharacters
| extend b = customDimensions["Field with spaces"]
| extend duration = customMeasurements["Duration (ms)"]
| project a, b, duration
| limit 10
(You don't need to use extend; you can use the fields this way with extend, project, summarize, or any other operator or function. I just used extend for the example here.)
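Since the question asks about the REST API specifically, here is a minimal Python sketch of sending such a query over HTTP. It assumes the public query endpoint at `https://api.applicationinsights.io/v1/apps/{app-id}/query`, authentication through an `x-api-key` header, and a response made of `tables` with `columns` and `rows`; check these against the current API documentation before relying on them. The helper names, the app ID, the key, and the sample payload are all hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint root; verify against the official API reference.
API_ROOT = "https://api.applicationinsights.io/v1/apps"

QUERY = """customEvents
| where timestamp >= ago(24h)
| summarize count() by name
| top 10 by count_"""

def build_query_request(app_id, api_key, query):
    """Build (but do not send) a GET request for the query endpoint."""
    url = f"{API_ROOT}/{app_id}/query?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(url, headers={"x-api-key": api_key})

def rows_as_dicts(payload):
    """Turn the first result table into a list of {column name: value} dicts."""
    table = payload["tables"][0]
    names = [col["name"] for col in table["columns"]]
    return [dict(zip(names, row)) for row in table["rows"]]

def run_query(app_id, api_key, query):
    """Send the query and return the parsed rows (needs network access and a valid key)."""
    with urllib.request.urlopen(build_query_request(app_id, api_key, query)) as resp:
        return rows_as_dicts(json.load(resp))

# Hypothetical payload in the assumed response shape, to show the parsing step offline.
sample_payload = {
    "tables": [{
        "name": "PrimaryResult",
        "columns": [{"name": "name", "type": "string"}, {"name": "count_", "type": "long"}],
        "rows": [["LinkClicked:/home", 42], ["LinkClicked:/docs", 17]],
    }]
}
print(rows_as_dicts(sample_payload))
```

Calling `run_query("<your-app-id>", "<your-api-key>", QUERY)` would then return one dict per event name, which covers the "most frequently accessed links" case if each link click is sent as its own event name or custom dimension.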
