How do I group by Username in Sentinel? I get confused with the project, extend, join, and summerize operations. Sanitized KQL Code and Picture attached. This is for mass downloads of files, but number has been lowered to trigger for MSP.
let threshold = 100;
let szSharePointFileOperation = "SharePointFileOperation";
let szOperations = dynamic(["FileDownloaded"]);
let starttime = 10m;
let endtime = 600m;
let historicalActivity =
OfficeActivity
| where TimeGenerated between(ago(starttime)..ago(endtime))
| where RecordType =~ szSharePointFileOperation
| where Operation in~ (szOperations)
| summarize historicalCount = count() by ClientIP, RecordType, Operation;
let recentActivity = OfficeActivity
| where TimeGenerated > ago(endtime)
| where RecordType =~ szSharePointFileOperation
| where Operation in~ (szOperations)
| summarize min(Start_Time), max(Start_Time), recentCount = count() by ClientIP, RecordType, Operation;
let RareIP = recentActivity | join kind= leftanti ( historicalActivity ) on ClientIP, RecordType, Operation
// More than 100 downloads/uploads from a new IP
| where recentCount > threshold;
OfficeActivity
| where TimeGenerated >= ago(endtime)
| where RecordType =~ szSharePointFileOperation
| where Operation in~ (szOperations)
| join kind= inner (RareIP) on ClientIP, RecordType, Operation
| where Start_Time between(min_Start_Time .. max_Start_Time)
| summarize StartTimeUtc = min(min_Start_Time), EndTimeUtc = max(max_Start_Time) by RecordType, Operation, UserType, UserId, ClientIP, OfficeWorkload, Site_Url, OfficeObjectId, UserAgent, IPSeenCount = recentCount
| extend timestamp = StartTimeUtc, AccountCustomEntity = UserId, IPCustomEntity = ClientIP, URLCustomEntity = Site_Url
| order by IPSeenCount desc, ClientIP asc, Operation asc, UserId asc
| where UserAgent <> "FileZip/1.0"
| where Site_Url contains "https://abc.sharepoint.com/sites/"
Results in image: https://ibb.co/QjW308q. I am trying to group by UserIDs or "AccountCustomIdentity" and not have seperate rows for each. Many thanks for any help!
If you are not interested to see the userIds, you can simply remove it from the "summarize" line here (this is the applicable line without it):
| summarize StartTimeUtc = min(min_Start_Time), EndTimeUtc = max(max_Start_Time) by RecordType, Operation, UserType, ClientIP, OfficeWorkload, Site_Url, OfficeObjectId, UserAgent, IPSeenCount = recentCount
Once you do this, these line will show and error under the UserId expressions:
| extend timestamp = StartTimeUtc, AccountCustomEntity = UserId, IPCustomEntity = ClientIP, URLCustomEntity = Site_Url
| order by IPSeenCount desc, ClientIP asc, Operation asc, UserId asc
You should remove these expressions from these lines as well and run the query. The UserIds will be gone.
As a side note, UserIds in this query is duplicated and show up twice in the results, once as UserId and once as AccountCustomEntity - this is strange.
Related
I am trying to write a kusto query to retrieve a custom property as below.
I want to retrieve count of pkgName and corresponding organization. I could retrieve the count of pkgName and the code is attached below.
let mainTable = union customEvents
| extend name =replace("\n", "", name)
| where iif('*' in ("*"), 1 == 1, name in ("*"))
| where true;
let queryTable = mainTable;
let cohortedTable = queryTable
| extend dimension = customDimensions["pkgName"]
| extend dimension = iif(isempty(dimension), "<undefined>", dimension)
| summarize hll = hll(itemId) by tostring(dimension)
| extend Events = dcount_hll(hll)
| order by Events desc
| serialize rank = row_number()
| extend dimension = iff(rank > 10, 'Other', dimension)
| summarize merged = hll_merge(hll) by tostring(dimension)
| project ['pkgName'] = dimension, Counts = dcount_hll(merged);
cohortedTable
Please help me to get the organization along with each pkgName projected.
Please try this simple query:
customEvents
| summarize counts=count(tostring(customDimensions.pkgName)) by pkgName=tostring(customDimensions.pkgName),organization=tostring(customDimensions.organization)
Please feel free to modify it to meet your requirement.
If the above does not meet your requirement, please try to create another table which contains pkgName and organization relationship. Then use join operator to join these tables. For example:
//create a table which contains the relationship
let temptable = customEvents
| summarize by pkgName=tostring(customDimensions.pkgName),organization=tostring(customDimensions.organization);
//then use the join operator to join these tables on the keyword pkgName.
I m getting database name from let statement (dbname) , the issue im getting blank output ,but when I pass the db name [when I give as hardcoded value, the query is working] . please help me to understand what the issue in query.
let view=datatable(Property:string,Value:dynamic)[];
let viewFile=datatable(FileName:string)[];
let dbnameview=datatable(dbname:string)[];
alias database db = cluster(X).database('');
let dbname=tostring(toscalar((
union isfuzzy=true dbnameview, cluster(X).database('$systemdb').Operations
| where Operation == "DatabaseCreate" and Database contains "oci-"| where State =='Completed'
and StartedOn between (datetime(2020-04-09) .. 1d)
| distinct Database , StartedOn
| order by StartedOn desc
| take 1 )));
//let dbname= 'Y';
let latestInfoFile = toscalar((
union isfuzzy=true viewFile,cluster(X).database(dbname).['TextFileLogs']
| where FileName contains "AzureStackStampInformation"
| distinct FileName
| order by FileName
| take 1)) ;
union isfuzzy=true view,(
cluster(X).database(dbname).['TextFileLogs']
| where FileName == latestInfoFile
| distinct LineNumber,FileLineContent
| order by LineNumber asc
| summarize StampInfo=(toobject(strcat_array(makelist(FileLineContent,100000), "\r\n")))
| mvexpand bagexpansion=array StampInfo
| project Property=tostring(StampInfo[0]), Value=StampInfo[1]
)|where Property contains "StampVersion" | project BuildNumber = Value;
what you're attempting to do isn't supported, as mentioned in the docs: https://learn.microsoft.com/en-us/azure/kusto/query/databasefunction
I am trying to write a query that will get me the average time between 2 custom events, sorted by user session. I have added custom tracking events throughout this application and I want to query the time it takes the user from 'Setup' event to 'Process' event.
let allEvents=customEvents
| where timestamp between (datetime(2019-09-25T15:57:18.327Z)..datetime(2019-09-25T16:57:18.327Z))
| extend SourceType = 5;
let allPageViews=pageViews
| take 0;
let all = allEvents
| union allPageViews;
let step1 = materialize(all
| where name == "Setup" and SourceType == 5
| summarize arg_min(timestamp, *) by user_Id
| project user_Id, step1_time = timestamp);
let step2 = materialize(step1
| join
hint.strategy=broadcast (all
| where name == "Process" and SourceType == 5
| project user_Id, step2_time=timestamp
)
on user_Id
| where step1_time < step2_time
| summarize arg_min(step2_time, *) by user_Id
| project user_Id, step1_time,step2_time);
let 1Id=step1_time;
let 2Id=step2_time;
1Id
| union 2Id
| summarize AverageTimeBetween=avg(step2_time - step1_time)
| project AverageTimeBetween
When I run this query it produces this error message:
'' operator: Failed to resolve table or column or scalar expression named 'step1_time'
I am relatively new to writing queries with AI and have not found many resources to assist with this problem. Thank you in advance for your help!
I'm not sure what the let 1id=step1_time lines are intended to do.
those lines are trying to declare a new value, but step1_time isn't a thing, it was a field in another query
i'm also not sure why you're doing that pageviews | take 0 and unioning it with events?
let allEvents=customEvents
| where timestamp between (datetime(2019-09-25T15:57:18.327Z)..datetime(2019-09-25T16:57:18.327Z))
| extend SourceType = 5;
let step1 = materialize(allEvents
| where name == "Setup" and SourceType == 5
| summarize arg_min(timestamp, *) by user_Id
| project user_Id, step1_time = timestamp);
let step2 = materialize(step1
| join
hint.strategy=broadcast (allEvents
| where name == "Process" and SourceType == 5
| project user_Id, step2_time=timestamp
)
on user_Id
| where step1_time < step2_time
| summarize arg_min(step2_time, *) by user_Id
| project user_Id, step1_time,step2_time);
step2
| summarize AverageTimeBetween=avg(step2_time - step1_time)
| project AverageTimeBetween
if I remove the things I don't understand (like union with 0 pageviews, and the lets, I get a result, but I don't have your data so I had to use other values than "Setup" and "Process" so I don't know if it is what you expect?
you might want to look at the results of the step2 query without the summarize to just see what you're getting matches what you expect.
User wants a count of unique sessions per week in application insights. I have the query working, including a pivot, but the Week columns are out of order. I would prefer if they were in order.
pageViews
| where timestamp < now()
| summarize Sessions= dcount(session_Id)
by Week=bin(datepart("weekOfYear", timestamp), 1), user_AuthenticatedId
| order by Week
| evaluate pivot(Week, sum(Sessions))
| join kind=innerunique (pageViews
| summarize MostRecentRequest = max(timestamp) by user_AuthenticatedId)
on $right.user_AuthenticatedId == $left.user_AuthenticatedId
| project-away user_AuthenticatedId1
I've tried ordering by timestamp before the summarize, and ordering by week after the summarize (still in there) and no luck.
There's currently a "trick" that will work: serialize right after your order by
pageViews
| where timestamp < now()
| where isnotempty(user_AuthenticatedId)
| summarize Sessions= dcount(session_Id)
by Week=bin(datepart("weekOfYear", timestamp), 1), user_AuthenticatedId
| order by Week
| serialize // <--------------------------------- RIGHT HERE
| evaluate pivot(Week, sum(Sessions))
| join kind=innerunique (pageViews
| summarize TotalSessions=dcount(session_Id), MostRecentRequest = max(timestamp) by user_AuthenticatedId)
on $right.user_AuthenticatedId == $left.user_AuthenticatedId
| project-away user_AuthenticatedId1
| top 100 by TotalSessions desc
gets me this in workbooks, with the weeks in descending order (I also added total session count to sort/top by with some custom column settings set):
the custom settings I have for the column settings in workbooks:
delete all the #'d columns that are there by default and add one for ^[0-9]+$ set to heatmap:
I refactored query a bit for my own comprehension. I took the the left and right into "views". Thought I'd share.
let users_MostRecent_Session =
pageViews
| summarize
TotalSessions=dcount(session_Id)
, MostRecentRequest = max(timestamp)
by
user_AuthenticatedId
;
//
let users_sessions_ByWeek =
pageViews
| where timestamp < now()
| where isnotempty(user_AuthenticatedId)
| summarize
Sessions= dcount(session_Id)
by
Week=bin(datepart("weekOfYear", timestamp), 1)
, user_AuthenticatedId
| order by Week
| serialize
| evaluate pivot(Week, sum(Sessions))
;
//
//
users_sessions_ByWeek
| join kind=innerunique
users_MostRecent_Session
on user_AuthenticatedId
| project-away user_AuthenticatedId1
| top 100 by TotalSessions desc
I'm trying to perform a very simple query for Firebase events stored in Google BigQuery but I´m not able to find a way to do it.
In the Android app, I´m logging an event like this:
Bundle params = new Bundle();
params.putInt("productID", productId);
params.putInt(FirebaseAnalytics.Param.VALUE, value);
firebaseAnalytics.logEvent("productEvent", params);
So, in BigQuery I have something like this:
___________________ _______________________ ____________________________
| event_dim.name | event_dim.params.key | event_dim.params.int_value |
|___________________|_______________________|____________________________|
| productEvent | productID | 25 |
| |_______________________|____________________________|
| | value | 1253 |
|___________________|_______________________|____________________________|
When I get the data from this table I get two rows:
___________________ _______________________ ____________________________
|event_dim.name | event_dim.params.key | event_dim.params.int_value |
|___________________|_______________________|____________________________|
| productEvent | productID | 25 |
| productEvent | value | 12353 |
But what I really need is a SELECT clause from this table to get the data as below:
___________________ _____________ _________
| name | productID | value |
|___________________|_____________|_________|
| productEvent | 25 | 12353 |
Any idea or suggestion?
You can pivot the values into columns like this
SELECT
event_dim.name as name,
MAX(IF(event_dim.params.key = "productID", event_dim.params.int_value, NULL)) WITHIN RECORD productID,
MAX(IF(event_dim.params.key = "value", event_dim.params.int_value, NULL)) WITHIN RECORD value,
FROM [events]
In case you want to generate this command using SQL, see this solution: Pivot Repeated fields in BigQuery
Using standard SQL (uncheck "Use Legacy SQL" under "Show Options" in the UI), you can express the query as:
SELECT
event_dim.name as name,
(SELECT value.int_value FROM UNNEST(event_dim.params)
WHERE key = "productID") AS productID,
(SELECT value.int_value FROM UNNEST(event_dim.params)
WHERE key = "value") AS value
FROM `dataset.mytable` AS t,
t.event_dim AS event_dim;
Edit: updated example to include int_value as part of value based on the comment below. Here is a self-contained example that demonstrates the approach as well:
WITH T AS (
SELECT ARRAY_AGG(event_dim) AS event_dim
FROM (
SELECT STRUCT(
"foo" AS name,
ARRAY<STRUCT<key STRING, value STRUCT<int_value INT64, string_value STRING>>>[
("productID", (10, NULL)), ("value", (5, NULL))
] AS params) AS event_dim
UNION ALL
SELECT STRUCT(
"bar" AS name,
ARRAY<STRUCT<key STRING, value STRUCT<int_value INT64, string_value STRING>>>[
("productID", (13, NULL)), ("value", (42, NULL))
] AS params) AS event_dim
)
)
SELECT
event_dim.name as name,
(SELECT value.int_value FROM UNNEST(event_dim.params)
WHERE key = "productID") AS productID,
(SELECT value.int_value FROM UNNEST(event_dim.params)
WHERE key = "value") AS value
FROM T AS t,
t.event_dim AS event_dim;