Parsing NSG Flow Logs in Azure Log Analytics Workspace to separate Public IP addresses - azure-data-explorer

I have been updating a KQL query for reviewing NSG Flow Logs to separate out the columns for public/external IP addresses. However, each cell of those columns contains additional information that needs to be parsed out so my Excel add-in can run NSLOOKUP against each cell and look for additional insights. Later I would like to use the parse operator (https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/parseoperator) to separate this information and determine what each external IP address belongs to through nslookup, resolve-dnsname, whois, or other means.
Currently I am attempting to parse out the column, but it is not comma delimited; instead it uses a single space and multiple pipes. Below is my query. I would like to add a parse to it that either produces a comma-delimited string in a single cell [for PublicIP (a combination of Source and Destination), PublicSourceIP, and PublicDestIP] or breaks the values out into multiple rows. How would parse best be used to separate this information, or is there a better operator for this?
For example, the content could look like this:
"20.xx.xx.xx|1|0|0|0|0|0 78.xxx.xxx.xxx|1|0|0|0|0|0"
AzureNetworkAnalytics_CL
| where SubType_s == 'FlowLog' and (FASchemaVersion_s == '1' or FASchemaVersion_s == '2')
| extend NSG = NSGList_s, Rule = NSGRule_s, Protocol = L4Protocol_s, Hits = (AllowedInFlows_d + AllowedOutFlows_d + DeniedInFlows_d + DeniedOutFlows_d)
| project-away NSGList_s, NSGRule_s
| project TimeGenerated, NSG, Rule, SourceIP = SrcIP_s, DestinationIP = DestIP_s, DestinationPort = DestPort_d, FlowStatus = FlowStatus_s, FlowDirection = FlowDirection_s, Protocol = L4Protocol_s, PublicIP = PublicIPs_s, PublicSourceIP = SrcPublicIPs_s, PublicDestIP = DestPublicIPs_s
// ## IP Address Filtering ##
| where isnotempty(PublicIP)
| parse kind=regex PublicIP with * "|1|0|0|0|0|0" ipnfo ' ' *
| project ipnfo
// ## Port Filtering ##
| where DestinationPort == 443

You can base this on extract_all(), followed by either strcat_array() or mv-expand:
let AzureNetworkAnalytics_CL = datatable (RecordId:int, PublicIPs_s:string)
[
1 ,"51.105.236.244|2|0|0|0|0|0 51.124.32.246|12|0|0|0|0|0 51.124.57.242|1|0|0|0|0|0"
,2 ,"20.44.17.10|6|0|0|0|0|0 20.150.38.228|1|0|0|0|0|0 20.150.70.36|2|0|0|0|0|0 20.190.151.9|2|0|0|0|0|0 20.190.151.134|1|0|0|0|0|0 20.190.154.137|1|0|0|0|0|0 65.55.44.109|2|0|0|0|0|0"
,3 ,"20.150.70.36|1|0|0|0|0|0 52.183.220.149|1|0|0|0|0|0 52.239.152.234|2|0|0|0|0|0 52.239.169.68|1|0|0|0|0|0"
];
// Option 1: one comma-delimited string of IPs per record
AzureNetworkAnalytics_CL
| project RecordId, PublicIPs = strcat_array(extract_all("(?:^| )([^|]+)", PublicIPs_s), ',');
// Option 2: one row per IP address
AzureNetworkAnalytics_CL
| mv-expand with_itemindex=i PublicIP = extract_all("(?:^| )([^|]+)", PublicIPs_s) to typeof(string)
| project RecordId, i = i + 1, PublicIP
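In both options, the regex "(?:^| )([^|]+)" matches each token that either starts the string or follows a space, and captures everything up to the first pipe, i.e. just the IP address without the trailing |1|0|0|0|0|0 counters.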
Option 1

| RecordId | PublicIPs |
|----------|-----------|
| 1 | 51.105.236.244,51.124.32.246,51.124.57.242 |
| 2 | 20.44.17.10,20.150.38.228,20.150.70.36,20.190.151.9,20.190.151.134,20.190.154.137,65.55.44.109 |
| 3 | 20.150.70.36,52.183.220.149,52.239.152.234,52.239.169.68 |
Option 2

| RecordId | i | PublicIP |
|----------|---|----------|
| 1 | 1 | 51.105.236.244 |
| 1 | 2 | 51.124.32.246 |
| 1 | 3 | 51.124.57.242 |
| 2 | 1 | 20.44.17.10 |
| 2 | 2 | 20.150.38.228 |
| 2 | 3 | 20.150.70.36 |
| 2 | 4 | 20.190.151.9 |
| 2 | 5 | 20.190.151.134 |
| 2 | 6 | 20.190.154.137 |
| 2 | 7 | 65.55.44.109 |
| 3 | 1 | 20.150.70.36 |
| 3 | 2 | 52.183.220.149 |
| 3 | 3 | 52.239.152.234 |
| 3 | 4 | 52.239.169.68 |
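Which option to pick depends on what you do next: Option 1 keeps one row per record, which fits the Excel add-in scenario where each cell feeds an NSLOOKUP; Option 2 yields one IP per row, which is usually easier to deduplicate or join against other tables before running external lookups.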

David's answer covers your question. I would just like to add that I have worked with the raw NSG Flow Logs and parsed them using KQL in this way:
The raw JSON:
{"records":[{"time":"2022-05-02T04:00:48.7788837Z","systemId":"x","macAddress":"x","category":"NetworkSecurityGroupFlowEvent","resourceId":"/SUBSCRIPTIONS/x/RESOURCEGROUPS/x/PROVIDERS/MICROSOFT.NETWORK/NETWORKSECURITYGROUPS/x","operationName":"NetworkSecurityGroupFlowEvents","properties":{"Version":2,"flows":[{"rule":"DefaultRule_DenyAllInBound","flows":[{"mac":"x","flowTuples":["1651463988,0.0.0.0,192.168.1.6,49944,8008,T,I,D,B,,,,"]}]}]}}]}
KQL parsing (assuming the raw JSON is ingested into a column named records):
| mv-expand records
| evaluate bag_unpack(records)
| extend flows = properties.flows
// first expansion: one row per NSG rule
| mv-expand flows
| evaluate bag_unpack(flows)
// second expansion: one row per inner flow group
| mv-expand flows
| extend flowz = flows.flowTuples
// one row per flow tuple
| mv-expand flowz
// each tuple is a comma-separated string; result[0] is the Unix timestamp
| extend result = split(tostring(flowz), ",")
| extend source_ip = tostring(result[1])
| extend destination_ip = tostring(result[2])
| extend source_port = tostring(result[3])
| extend destination_port = tostring(result[4])
| extend protocol = tostring(result[5])
| extend traffic_flow = tostring(result[6])
| extend traffic_decision = tostring(result[7])
| extend flow_state = tostring(result[8])
| extend packets_src_to_dst = tostring(result[9])
| extend bytes_src_to_dst = tostring(result[10])
| extend packets_dst_to_src = tostring(result[11])
| extend bytes_dst_to_src = tostring(result[12])
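To tie this back to the original question of isolating public/external addresses, you could then filter the parsed flows with the built-in ipv4_is_private() function. A minimal sketch on top of the columns extended above:
// keep only flows where at least one endpoint is a public (non-RFC 1918) address
| where not(ipv4_is_private(source_ip)) or not(ipv4_is_private(destination_ip))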

Related

Parse data in Kusto

I am trying to parse the data below in Kusto and need help.
[[ObjectCount][LinkCount][DurationInUs]]
[ChangeEnumeration][[88][9][346194]]
[ModifyTargetInLive][[3][6][595903]]
I need a generic implementation, without any hardcoding.
Ideally, you'd be able to change the component that produces the source data to use a standard format (e.g. CSV, JSON, etc.) instead.
The following could work, but you should consider it very inefficient:
let T = datatable(s:string)
[
'[[ObjectCount][LinkCount][DurationInUs]]',
'[ChangeEnumeration][[88][9][346194]]',
'[ModifyTargetInLive][[3][6][595903]]',
];
let keys = toscalar(
T
| where s startswith "[[" // the header row
| take 1
| project extract_all(#'\[([^\[\]]+)\]', s)
);
T
| where s !startswith "[[" // the data rows
| project values = extract_all(#'\[([^\[\]]+)\]', s)
| mv-apply with_itemindex = i keys on (
extend Category = tostring(values[0]), p = pack(tostring(keys[i]), values[i + 1])
| summarize b = make_bag(p) by Category
)
| project-away values
| evaluate bag_unpack(b)
which returns:
| Category | ObjectCount | LinkCount | DurationInUs |
|--------------------|-------------|-----------|--------------|
| ChangeEnumeration | 88 | 9 | 346194 |
| ModifyTargetInLive | 3 | 6 | 595903 |
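Note that extract_all() returns strings, so the unpacked columns come out as strings as well; if you need numeric columns, you can cast them afterwards, e.g.:
| extend ObjectCount = tolong(ObjectCount), LinkCount = tolong(LinkCount), DurationInUs = tolong(DurationInUs)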

Cumulative count of occurrences per value in array in Kusto

I'm looking to get the count of query-param usage from the query string of page views stored in App Insights, using KQL. My query currently looks like:
pageViews
| project parsed=parseurl(url)
| project keys=bag_keys(parsed["Query Parameters"])
The results (screenshots in the original post) are one row per page view, each holding the array of query-parameter keys.
I'm looking to get the count of each value in the list when it is contained in the URL, in order to answer the question "How many times does page appear in the query string?". So the results might look like:

| Page | From | ... |
|------|------|-----|
| 1000 | 67   | ... |
Thanks in advance
you could try something along the following lines:
datatable(url:string)
[
"https://a.b.c/d?p1=hello&p2=world",
"https://a.b.c/d?p2=world&p3=foo&p4=bar"
]
| project parsed = parseurl(url)
| project keys = bag_keys(parsed["Query Parameters"])
| mv-expand key = ['keys'] to typeof(string)
| summarize count() by key
which returns:
| key | count_ |
|-----|--------|
| p1 | 1 |
| p2 | 2 |
| p3 | 1 |
| p4 | 1 |
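If you are only after a single parameter (e.g. how often page appears), you can add a where between the mv-expand and the summarize:
| mv-expand key = ['keys'] to typeof(string)
| where key == "page"
| summarize count() by key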

Activiti and candidate groups

In APS 1.8.1, I have defined a process where each task has a candidate group.
When I log in with a user that belongs to a candidate group, I cannot see the process instance.
I have found out that when I try to access the process instances, APS executes the following query in the database:
select distinct RES.* , DEF.KEY_ as PROC_DEF_KEY_, DEF.NAME_ as PROC_DEF_NAME_, DEF.VERSION_ as PROC_DEF_VERSION_, DEF.DEPLOYMENT_ID_ as DEPLOYMENT_ID_
from ACT_HI_PROCINST RES
left outer join ACT_RE_PROCDEF DEF on RES.PROC_DEF_ID_ = DEF.ID_
left join ACT_HI_IDENTITYLINK I_OR0 on I_OR0.PROC_INST_ID_ = RES.ID_
WHERE RES.TENANT_ID_ = 'tenant_1'
and
( (
exists(select LINK.USER_ID_ from ACT_HI_IDENTITYLINK LINK where USER_ID_ = '1003' and LINK.PROC_INST_ID_ = RES.ID_)
)
or (
I_OR0.TYPE_ = 'participant'
and
I_OR0.GROUP_ID_ IN ('1','2','2023','2013','2024','2009','2025','2026','2027','2028','2029','2007','2018','2020','2017','2015','2012','2003','2021','2019','2004','2002','2005','2030','2031','2032','2011','2006','2008','2014','2010','2016','2022','2033','2034','2035','2036','2037','1003')
) )
order by RES.START_TIME_ desc
LIMIT 50 OFFSET 0
This query does not return any record for two reasons:
In my ACT_HI_IDENTITYLINK no tasks have both the group_id_ and the proc_inst_id_ set.
The type of the record is "candidate", but the query is looking for "participant".
select * from ACT_HI_IDENTITYLINK;
-[ RECORD 1 ]-+----------
id_ | 260228
group_id_ |
type_ | starter
user_id_ | 1002
task_id_ |
proc_inst_id_ | 260226
-[ RECORD 2 ]-+----------
id_ | 260294
group_id_ | 2006
type_ | candidate
user_id_ |
task_id_ | 260293
proc_inst_id_ |
-[ RECORD 3 ]-+----------
id_ | 260300
group_id_ | 2009
type_ | candidate
user_id_ |
task_id_ | 260299
proc_inst_id_ |
-[ RECORD 4 ]-+----------
id_ | 262503
group_id_ |
type_ | starter
user_id_ | 1002
task_id_ |
proc_inst_id_ | 262501
-[ RECORD 5 ]-+----------
id_ | 262569
group_id_ | 2016
type_ | candidate
user_id_ |
task_id_ | 262568
proc_inst_id_ |
-[ RECORD 6 ]-+----------
id_ | 262575
group_id_ | 2027
type_ | candidate
user_id_ |
task_id_ | 262574
proc_inst_id_ |
Why is the query looking only for "participant", and why do the records with type_ = 'candidate' not have any proc_inst_id_ set?
UPDATE:
The problem with the "participant" constraint has a simple workaround: it is enough to add the same candidate group as a participant.
See also Feature allowing "Participant" configuration in BPM Modeler
Unfortunately this is not enough to solve the second problem. The record is still not returned because the column proc_inst_id_ is not set.
I tried to update the column manually on the "participant" record and I have verified that doing so the page is accessible and works well.
Does anyone know why the column is not set ?
A possible solution (or workaround until ACTIVITI-696 is fixed) is to add each group that is added as a candidate on a task as a participant of the process instance.
There is a REST API that does it:
POST /enterprise/process-instances/{processInstanceId}/identitylinks
What this API does should be done by a task listener that will automatically add the candidate groups of the created task as participant of the process instance.
To add the new identity link, use the following lines in the listener:
ActivitiEntityEvent aee = (ActivitiEntityEvent) activitiEvent;
TaskEntity taskEntity = (TaskEntity) aee.getEntity();
List<IdentityLinkEntity> identities = taskEntity.getIdentityLinks();
if (identities != null) {
    for (IdentityLinkEntity identityLinkEntity : identities) {
        String groupId = identityLinkEntity.getGroupId();
        if (groupId != null) { // user-based links (e.g. the assignee) carry no group id
            runtimeService.addGroupIdentityLink(activitiEvent.getProcessInstanceId(), groupId, "participant");
        }
    }
}
First, check that your workflow has really started by looking at "Workflows I have started". You should see your task under "Active tasks"; if not, there are errors in your process definition. If everything is OK, check your group name, and don't forget the "GROUP_" prefix, e.g. GROUP_myGRPName.
If you want to see the workflow instances, it's simpler with a web script and the services.

Compare stored values in Selenium IDE

I am new to test automation and to Selenium IDE. With Selenium IDE, I want to store two integer values and compare them. The test passes if the compared result is greater than or equal to zero. So far, I have only found an option to store the values, and I am wondering if there is any option to compare the stored values.
Any suggestion would be helpful.
Thanks
Okay, assuming you're always subtracting A (a constant value) from B (a variable value), you can use some JavaScript to perform the test.
store | 2 | A
store | 4 | B
storeEval | var s = false; s = eval((storedVars['B'] - storedVars['A']) >=0); | s
verifyExpression | ${s}
Replace the two store steps above with whatever you use to get your variables A and B.
The verifyExpression line will pass (return true) if the result is greater than or equal to zero, and fail (stay false) if not.
store | 2 | A
store | 4 | B
storeEval | var s = false; s = eval((storedVars['B'] - storedVars['A']) >=0); | s
echo | ${s}
Running this produces the following log:
Executing: |store | 2 | A |
Executing: |store | 4 | B |
Executing: |storeEval | var s = false; s = eval((storedVars['B'] - storedVars['A']) >=0); | s |
script is: var s = false; s = eval((storedVars['B'] - storedVars['A']) >=0);
Executing: |echo | ${s} | |
echo: true
Test case passed

SQLite3: dynamic between query

I have this sqlite3 table (simplified):
+--------+----------+-------+
| ROUTE | WPNumber | WPID |
+--------+----------+-------+
| A123 | 1 | WP001 |
| A123 | 2 | WP002 |
| A123 | 3 | WP003 |
| [...] | [...] | [...] |
| A123 | 20 | WP020 |
+--------+----------+-------+
Let's say I want to travel this route in the reverse direction (020 to 001).
How do I get all the WPIDs in between? I know it is possible to build a query using BETWEEN and DESC, but then I'd have to build two separate queries and have Python check when to use which query. Is it possible to have sqlite3 do the work, independent of the direction (reverse or not)?
You can reverse the sort order by flipping the sign of the value used in the ORDER BY clause.
Set the parameter ? to either 1 (forward) or -1 (reverse, so WPNumber 20 comes first):
SELECT WPID
FROM ThisTable
WHERE ROUTE = 'A123'
ORDER BY WPNumber * ?
Note that if you instead used two similar queries, one plain and one with DESC, the database would have a better opportunity to optimize the sorting with an index.
