Measuring the success rate of a command executed using Kusto Query

I am trying to find the success rate of a command (e.g. call). I have scenario markers in place that record success, and the data is collected. Now I am using Kusto queries to build a dashboard that measures the success rate when the command is triggered.
I was trying to use percentiles to measure the success rate of the command over a period of time, as below:
Table
| where Table_Name == "call_command" and Table_Step == "CommandReceived"
| parse Table_MetaData with * "command = " command: string "," *
| where command == "call"
| summarize percentiles(command, 5, 50, 95) by Event_Time
The above query throws a "recognition error". Also, is this the right way to find the success rate of the command?
Update:
Successful command output:
call_command CommandReceived OK null null 1453 null [command = call,id = b444,retryAttempt = 0] [null] [null]
Unsuccessful command output:
call_command STOP ERROR INVALID_VALUE Failed to execute command: call, id: b444, status code: 0, error code: INVALID_VALUE, error details: . 556 [command = call,id = b444,retryAttempt = 0] [null] [null]
Table_Name - call_command
Table_Step - CommandReceived/STOP
Table_MetaData - [command = call,id = b444,retryAttempt = 0]
Table_Status - OK/ERROR

percentiles() requires that its first argument is numeric/bool/timespan/datetime; a string argument is not valid. It seems that the first step is to extract whether a call was successful; once you have such a column you can calculate the percentiles for it. Here is an example similar to your use case:
let Table = datatable(Event_Time:datetime, Table_MetaData:string) [
    datetime(2021-05-01), "call_command CommandReceived OK null null 1453 null [command = call,id = b444,retryAttempt = 0] [null] [null] ",
    datetime(2021-05-01), "call_command CommandReceived OK null null 1453 null [command = call,id = b444,retryAttempt = 0] [null] [null] ",
    datetime(2021-05-01), "call_command STOP ERROR INVALID_VALUE Failed to execute command: call, id: b444, status code: 0, error code: INVALID_VALUE, error details: . 556 [command = call,id = b444,retryAttempt = 0] [null] [null]"
]
| extend CommandStatus = split(Table_MetaData, " ")[2]
| extend Success = iif(CommandStatus == "OK", true, false)
| parse Table_MetaData with * "command = " command: string "," *
| where command == "call"
| summarize percentiles(Success, 5, 50, 95) by bin(Event_Time, 1d);
Table
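If what you ultimately want is a success rate rather than percentiles, a ratio of successful events to total events may be the more direct measure. A minimal sketch, assuming the Table_Status column from the update above carries OK/ERROR:
Table
| where Table_Name == "call_command"
| parse Table_MetaData with * "command = " command: string "," *
| where command == "call"
| summarize SuccessRate = 100.0 * countif(Table_Status == "OK") / count() by bin(Event_Time, 1d)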

Related

Syntax error: SYN0001 despite it working on the Kusto query editor online

I'm building queries in Python and executing them on my Kusto clusters using the Kusto client's execute_query method.
I've been hit by the following error: azure.kusto.data.exceptions.KustoApiError: Request is invalid and cannot be processed: Syntax error: SYN0001: I could not parse that, sorry. [line:position=0:0].
However, when debugging, I've taken the query as-is and run it on my clusters through the Kusto platform on Azure.
The query is similar to the following:
StormEvents
| where ingestion_time() > ago(1h)
| summarize
count_matching_regex_State=countif(State matches regex "[A-Z]*"),
count_not_empty_State=countif(isnotempty(State))
| summarize
Matching_State=sum(count_matching_regex_State),
NotEmpty_State=sum(count_not_empty_State)
| project
ratio_State=todouble(Matching_State) / todouble(Matching_State + NotEmpty_State)
| project
ratio_State=iff(isnan(ratio_State), 0.0, round(ratio_State, 3))
Queries are built in Python using string interpolations and such:
## modules.py
import logging

def match_regex_query(fields: list, regex_patterns: list, kusto_client):
    def match_regex_statements(field, regex_patterns):
        return " or ".join(list(map(lambda pattern: f"{field} matches regex \"{pattern}\"", regex_patterns)))

    count_regex_statement = list(map(
        lambda field: f"count_matching_regex_{field} = countif({match_regex_statements(field, regex_patterns)}), count_not_empty_{field} = countif(isnotempty({field}))",
        fields))
    count_regex_statement = ", ".join(count_regex_statement)
    summarize_sum_statement = list(map(lambda field: f"Matching_{field} = sum(count_matching_regex_{field}), NotEmpty_{field} = sum(count_not_empty_{field})", fields))
    summarize_sum_statement = ", ".join(summarize_sum_statement)
    project_ratio_statement = list(map(lambda field: f"ratio_{field} = todouble(Matching_{field})/todouble(Matching_{field}+NotEmpty_{field})", fields))
    project_ratio_statement = ", ".join(project_ratio_statement)
    project_round_statement = list(map(lambda field: f"ratio_{field} = iff(isnan(ratio_{field}),0.0,round(ratio_{field}, 3))", fields))
    project_round_statement = ", ".join(project_round_statement)

    query = f"""
    StormEvents
    | where ingestion_time() > ago(1h)
    | summarize {count_regex_statement}
    | summarize {summarize_sum_statement}
    | project {project_ratio_statement}
    | project {project_round_statement}
    """
    clean_query = query.replace("\n", " ").strip()
    try:
        result = kusto_client.execute_query("Samples", clean_query)
    except Exception as err:
        logging.exception(f"Error while computing regex metric : {err}")
        result = []
    return result

## main.py
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

# provide your kusto client here
cluster = "https://help.kusto.windows.net"
kcsb = KustoConnectionStringBuilder.with_interactive_login(cluster)
client = KustoClient(kcsb)
fields = ["State"]
regex_patterns = ["[A-Z]*"]
metrics = match_regex_query(fields, regex_patterns, client)
Is there a better way to debug this problem?
TIA!
The query your code generates is invalid: the regular expressions include characters that aren't properly escaped.
See: the string data type.
This is your invalid query (based on the client request ID you provided in the comments):
LiveStream_CL()
| where ingestion_time() > ago(1h)
| summarize count_matching_regex_deviceHostName_s = countif(deviceHostName_s matches regex "^[a-zA-Z0-9\$]([a-zA-Z0-9\-\_\.\$]{0,61}[a-zA-Z0-9\$])?$"), count_not_empty_deviceHostName_s = countif(isnotempty(deviceHostName_s)), count_matching_regex_sourceHostName_s = countif(sourceHostName_s matches regex "^[a-zA-Z0-9\$]([a-zA-Z0-9\-\_\.\$]{0,61}[a-zA-Z0-9\$])?$"), count_not_empty_sourceHostName_s = countif(isnotempty(sourceHostName_s)), count_matching_regex_destinationHostName_s = countif(destinationHostName_s matches regex "^[a-zA-Z0-9\$]([a-zA-Z0-9\-\_\.\$]{0,61}[a-zA-Z0-9\$])?$"), count_not_empty_destinationHostName_s = countif(isnotempty(destinationHostName_s)), count_matching_regex_agentHostName_s = countif(agentHostName_s matches regex "^[a-zA-Z0-9\$]([a-zA-Z0-9\-\_\.\$]{0,61}[a-zA-Z0-9\$])?$"), count_not_empty_agentHostName_s = countif(isnotempty(agentHostName_s))
...
whereas this is how it should look (note the @ prefixes, which mark verbatim string literals so the backslashes are taken literally):
LiveStream_CL()
| where ingestion_time() > ago(1h)
| summarize
    count_matching_regex_deviceHostName_s = countif(deviceHostName_s matches regex @"^[a-zA-Z0-9\$]([a-zA-Z0-9\-\_\.\$]{0,61}[a-zA-Z0-9\$])?$"),
    count_not_empty_deviceHostName_s = countif(isnotempty(deviceHostName_s)),
    count_matching_regex_sourceHostName_s = countif(sourceHostName_s matches regex @"^[a-zA-Z0-9\$]([a-zA-Z0-9\-\_\.\$]{0,61}[a-zA-Z0-9\$])?$"),
    count_not_empty_sourceHostName_s = countif(isnotempty(sourceHostName_s)),
    count_matching_regex_destinationHostName_s = countif(destinationHostName_s matches regex @"^[a-zA-Z0-9\$]([a-zA-Z0-9\-\_\.\$]{0,61}[a-zA-Z0-9\$])?$"),
    count_not_empty_destinationHostName_s = countif(isnotempty(destinationHostName_s)),
    count_matching_regex_agentHostName_s = countif(agentHostName_s matches regex @"^[a-zA-Z0-9\$]([a-zA-Z0-9\-\_\.\$]{0,61}[a-zA-Z0-9\$])?$"),
    count_not_empty_agentHostName_s = countif(isnotempty(agentHostName_s))
...
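On the Python side, one way to produce that (a minimal sketch, not the only fix) is to have the generator wrap each pattern in a verbatim literal instead of a plain quoted string; alternatively, you could double the backslashes in the patterns themselves. This assumes the patterns contain no double quotes:
## modules.py (patched statement builder)
def match_regex_statements(field, regex_patterns):
    # Wrap each pattern in a KQL verbatim string literal (@"...") so that
    # backslashes such as \$ are taken literally rather than parsed as
    # (invalid) escape sequences by the Kusto query parser.
    return " or ".join(f'{field} matches regex @"{pattern}"' for pattern in regex_patterns)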

How to use the unittest module for a Python script which calls multiple functions and uses global variables

This is one of the functions in my Python script for which I am trying to write unit test cases. Since it uses global variables, as well as audit and BigQuery functions that are written in separate utility scripts, I don't understand how to write @patch decorators and execute unit tests for it.
How will I patch global variables?
How do I patch functions which don't return anything (e.g. audit_event_source_table)? Can we ignore such functions during unit testing? If so, how?
How do I do assertions when I have no return value, only logger.info messages?
import logging
from datetime import datetime
from pathlib import Path
import sys
import __main__

from intient_research_rdm_common.utils.audit_utils import audit_event_source_table, audit_event_job_table, \
    get_job_id, get_source_object_id
from intient_research_rdm_kg_core.common_utils.utils.bigquery_utils import bigquery_data_read
from intient_research_rdm_kg_core.common_utils.utils.conf_read import read_args, read_source_config, read_env_config

global project_id, service_account, conn_ip, debug, node_table_list, edge_table_list, source_name

def edge_validation():
    global edge_table_list
    global source_name

    edge_table_na = []
    edge_table_list_rowcount_zero = []
    dataset_e = "prep_e_" + source_name
    row_count = 0
    edge_table = ""
    source_object_start_timestamp = datetime.now()
    source_object_id = get_source_object_id(source_name, source_object_start_timestamp)
    source_object_type = AUDIT_SOURCE_OBJECT_TYPE_BIGQUERY
    job_id = get_job_id(source_object_start_timestamp)
    source_object_name = dataset_e
    try:
        for edge_table in edge_table_list:
            sql_query = " SELECT * FROM " + "`" + project_id + "." + dataset_e + ".__TABLES__` WHERE table_id =" + "'" + edge_table + "'"
            data_read, col_names = bigquery_data_read(service_account, sql_query, project_id)
            for ind in data_read.index:
                row_count = (data_read['row_count'][ind])
            if len(data_read.index) == 0:
                edge_table_na.append(edge_table)
            elif row_count == 0:
                edge_table_list_rowcount_zero.append(edge_table)
        if len(edge_table_na) > 0:
            logging.info("Missing Edge tables in preprocessing layer {} ".format(edge_table_na))
        if len(edge_table_list_rowcount_zero) > 0:
            logging.info("Edge tables with row count as zero in Pre-processing layer {} ".format(edge_table_list_rowcount_zero))
        if len(edge_table_na) == 0 and len(edge_table_list_rowcount_zero) == 0:
            logging.info(
                "Edge list validation for the source {} has been successfully completed with no discrepancies".format(
                    source_name))
            audit_event_source_table(source_object_id, source_object_name, source_object_type, source_name,
                                     source_object_name,
                                     job_id, AUDIT_JOB_STATUS_PASS, source_object_start_timestamp,
                                     datetime.now(), 'NA', 'NA', project_id)
        if len(edge_table_na) > 0 or len(edge_table_list_rowcount_zero) > 0:
            audit_event_source_table(source_object_id, source_object_name, source_object_type, source_name,
                                     source_object_name,
                                     job_id, AUDIT_JOB_STATUS_PASS, source_object_start_timestamp,
                                     datetime.now(), 'NA', 'NA', project_id)
            sys.exit(1)
    except Exception as e:
        msg = '{} : Issue with the edge validation for {} is: \n{}\n'.format(datetime.now(), edge_table, e)
        logging.error(msg)
        audit_event_source_table(source_object_id, source_object_name, source_object_type, source_name,
                                 source_object_name,
                                 job_id, AUDIT_JOB_STATUS_FAIL, source_object_start_timestamp,
                                 datetime.now(), AUDIT_ERROR_TYPE_PREPROCESSING_KG_LAYER_VALIDATION, msg,
                                 project_id)
        raise Exception(msg)
Patch global variables - in the same way that you patch a method of a class, you patch the global variable. It's not clear in your code snippet where the global variables are defined (i.e. do you import these variables from another module, or do you assign to them at the top of your Python script?). Either way, you patch in the namespace where the function is being used. If you can confirm further details I will be able to assist.
Personally, the way I patch and test functions with no return value is the same. For example, if I wanted to patch the source_object_start_timestamp variable, I would use: source_object_start_timestamp = patch('pandas.datetime.utcnow', return_value="2020-08-16 20:36:06.578174").start(). For BigQuery functions, I would still patch them, but in your unit test use the call_count attribute of the unittest.mock.Mock class to test whether that function has been called.
Point 2 addresses your third query - use the call_count attribute to check how many times the mock has been called.
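Building on that, here is a minimal sketch of one such test. It assumes the script above is importable as a module named edge_validation_script (a hypothetical name), that bigquery_data_read returns a pandas DataFrame plus column names, and that the AUDIT_* constants may need to be created on the module at patch time:
import unittest
from unittest.mock import patch

import pandas as pd

import edge_validation_script  # hypothetical module name for the script above

class EdgeValidationTest(unittest.TestCase):

    @patch("edge_validation_script.audit_event_source_table")
    @patch("edge_validation_script.get_source_object_id", return_value="obj-1")
    @patch("edge_validation_script.get_job_id", return_value="job-1")
    @patch("edge_validation_script.bigquery_data_read")
    def test_missing_edge_table_is_logged(self, mock_read, mock_job, mock_obj, mock_audit):
        # An empty result set makes edge_validation treat the table as missing.
        mock_read.return_value = (pd.DataFrame({"row_count": []}), ["row_count"])
        # Patch the module-level globals; create=True also covers names the
        # snippet gets from elsewhere, such as the AUDIT_* constants.
        with patch.multiple(
            edge_validation_script,
            project_id="test-project",
            service_account="sa@test",
            source_name="src",
            edge_table_list=["edge_a"],
            AUDIT_SOURCE_OBJECT_TYPE_BIGQUERY="BIGQUERY",
            AUDIT_JOB_STATUS_PASS="PASS",
            create=True,
        ):
            # No return value to assert on, so assert on the log output and
            # on the sys.exit(1) the missing-table branch performs.
            with self.assertLogs(level="INFO") as captured, self.assertRaises(SystemExit):
                edge_validation_script.edge_validation()
        self.assertTrue(any("Missing Edge tables" in line for line in captured.output))
        # The no-return audit helper is not ignored: assert it was called.
        self.assertEqual(mock_audit.call_count, 1)

if __name__ == "__main__":
    unittest.main()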

How to debug FactoInvestigate error: cannot open the connection

Has anyone faced this error?
Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection
I was trying to investigate the MCA results using:
Investigate(MCA_res, file = "MCA.Rmd", document = c("word_document", "pdf_document"))
This hits an error after the following steps:
-- creation of the .Rmd file (time spent : 0s) --
-- detection of outliers (time spent : 0s) --
0 outlier(s) terminated
-- analysis of the inertia (time spent : 0.08s) --
12 component(s) carrying information : total inertia of 54.3%
-- components description (time spent : 6.25s) --
plane 1:2
plane 3:4
plane 5:6
plane 7:8
plane 9:10
plane 11:12
-- classification (time spent : 7.04s) --
3 clusters
-- annexes writing (time spent : 7.24s) --
-- saving data (time spent : 8.83s) --
-- outputs compilation (time spent : 8.83s) --
Quitting from lines 13-15 (MCA.Rmd)
**Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection**
In addition: Warning messages:
1: In if (document == "Word" | document == "word" | document == "doc" | :
the condition has length > 1 and only the first element will be used
2: In if (document == "html" | document == "HTML" | document == "HTML_document") { :
the condition has length > 1 and only the first element will be used
3: In if (document == "pdf" | document == "PDF") { :
the condition has length > 1 and only the first element will be used
4: In if (document == "word_document") { :
the condition has length > 1 and only the first element will be used
Any help understanding this will be highly appreciated.
Tx
Rj
The problem is with the path to "Workspace.RData", which needs to be loaded. I resolved it by changing the source: in "R/Investigate.R", line 74, write:
writeRmd(paste0("library(FactoMineR)\nload('", getwd(), "/Workspace.RData')"), file = file, start = TRUE, stop = TRUE, options = "r, echo = FALSE")
in place of:
writeRmd(paste0("library(FactoMineR)\nload('~/Workspace.RData')"), file = file, start = TRUE, stop = TRUE, options = "r, echo = FALSE")
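As an aside, the four warnings above ("the condition has length > 1") come from passing a two-element vector as document: Investigate() compares document with == internally, so only the first element is used. A minimal sketch of the call with a single output format, which avoids those warnings:
library(FactoInvestigate)
# One output format at a time keeps the internal if (document == ...) checks scalar.
Investigate(MCA_res, file = "MCA.Rmd", document = "word_document")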

Application Engine Peoplecode bind variables

I have the below PeopleCode step in an Application Engine program that reads a CSV file using a File Layout and then inserts the data into a table. I am trying to get a better understanding of how the line of code &SQL1 = CreateSQL("%Insert(:1)"); gets generated. It looks like CreateSQL is using a bind variable (:1) inside the Insert statement, but I am struggling to find where this variable is defined in the program.
Function EditRecord(&REC As Record) Returns boolean;
   Local integer &E;
   &REC.ExecuteEdits(%Edit_Required + %Edit_DateRange + %Edit_YesNo + %Edit_OneZero);
   If &REC.IsEditError Then
      For &E = 1 To &REC.FieldCount
         &MYFIELD = &REC.GetField(&E);
         If &MYFIELD.EditError Then
            &MSGNUM = &MYFIELD.MessageNumber;
            &MSGSET = &MYFIELD.MessageSetNumber;
            &LOGFILE.WriteLine("****Record:" | &REC.Name | ", Field:" | &MYFIELD.Name);
            &LOGFILE.WriteLine("****" | MsgGet(&MSGSET, &MSGNUM, ""));
         End-If;
      End-For;
      Return False;
   Else
      Return True;
   End-If;
End-Function;

Function ImportSegment(&RS2 As Rowset, &RSParent As Rowset)
   Local Rowset &RS1, &RSP;
   Local string &RecordName;
   Local Record &REC2, &RECP;
   Local SQL &SQL1;
   Local integer &I, &L;
   &SQL1 = CreateSQL("%Insert(:1)");
   rem &SQL1 = CreateSQL("%Insert(:1) Order by COUNT_ORDER");
   &RecordName = "RECORD." | &RS2.DBRecordName;
   &REC2 = CreateRecord(@(&RecordName));
   &RECP = &RSParent(1).GetRecord(@(&RecordName));
   For &I = 1 To &RS2.ActiveRowCount
      &RS2(&I).GetRecord(1).CopyFieldsTo(&REC2);
      If (EditRecord(&REC2)) Then
         &SQL1.Execute(&REC2);
         &RS2(&I).GetRecord(1).CopyFieldsTo(&RECP);
         For &L = 1 To &RS2.GetRow(&I).ChildCount
            &RS1 = &RS2.GetRow(&I).GetRowset(&L);
            If (&RS1 <> Null) Then
               &RSP = &RSParent.GetRow(1).GetRowset(&L);
               ImportSegment(&RS1, &RSP);
            End-If;
         End-For;
         If &RSParent.ActiveRowCount > 0 Then
            &RSParent.DeleteRow(1);
         End-If;
      Else
         &LOGFILE.WriteRowset(&RS);
         &LOGFILE.WriteLine("****Correct error in this record and delete all error messages");
         &LOGFILE.WriteRecord(&REC2);
         For &L = 1 To &RS2.GetRow(&I).ChildCount
            &RS1 = &RS2.GetRow(&I).GetRowset(&L);
            If (&RS1 <> Null) Then
               &LOGFILE.WriteRowset(&RS1);
            End-If;
         End-For;
      End-If;
   End-For;
End-Function;

rem *****************************************************************;
rem * PeopleCode to Import Data *;
rem *****************************************************************;
Local File &FILE1, &FILE3;
Local Record &REC1;
Local SQL &SQL1;
Local Rowset &RS1, &RS2;
Local integer &M;

&FILE1 = GetFile("\\nt115\apps\interface_prod\interface_in\Item_Loader\ItemPriceFile.csv", "r", "a", %FilePath_Absolute);
&LOGFILE = GetFile("\\nt115\apps\interface_prod\interface_in\Item_Loader\ItemPriceFile.txt", "r", "a", %FilePath_Absolute);
&FILE1.SetFileLayout(FileLayout.GH_ITM_PR_UPDT);
&LOGFILE.SetFileLayout(FileLayout.GH_ITM_PR_UPDT);
&RS1 = &FILE1.CreateRowset();
&RS = CreateRowset(Record.GH_ITM_PR_UPDT);
REM &SQL1 = CreateSQL("%Insert(:1)");
&SQL1 = CreateSQL("%Insert(:1)");
/* Skip Header Row: the following line reads the first line in the file layout (the header)
   and does nothing. Then the pointer moves to the next line in the file and starts using
   FILE.ReadRowset */
&some_boolean = &FILE1.ReadLine(&string);
&RS1 = &FILE1.ReadRowset();
While &RS1 <> Null
   ImportSegment(&RS1, &RS);
   &RS1 = &FILE1.ReadRowset();
End-While;
&FILE1.Close();
&LOGFILE.Close();
The :1 is coming from the line further down: &SQL1.Execute(&REC2);
&REC2 is assigned a record object, so the line &SQL1.Execute(&REC2); evaluates to %Insert(your_record_object).
Here is a simple example that's doing basically the same thing
Here is a description of %Insert
Answer because too long to comment:
The table name is most likely (PS_)GH_ITM_PR_UPDT. The general consensus is to name the FileLayout the same as the record it is based on.
If not, it is defined in FileLayout.GH_ITM_PR_UPDT. Open the FileLayout, right click the segment and under 'Selected Node Properties' you will find the 'File Record Name'.
In your code this record is carried over into &RS1.
&FILE1.SetFileLayout(FileLayout.GH_ITM_PR_UPDT);
&RS1 = &FILE1.CreateRowset();
The rowset is a collection of rows. A row consists of records and a record is a row of data from a database table. (Peoplesoft Object Data Types are fun...)
This rowset is filled with data in the following statement:
&RS1 = &FILE1.ReadRowset();
This uses your file as input and outputs a rowset collection, mapping the data to records based on how you defined your FileLayout.
The result is fed into the ImportSegment function:
ImportSegment(&RS1, &RS);
Function ImportSegment(&RS2 As Rowset, &RSParent As Rowset)
&RS2 in the function is a reference to &RS1 in the rest of your code.
The table name is also hidden here:
&RecordName = "RECORD." | &RS2.DBRecordName;
So if you can't or don't want to check the FileLayout, you could output &RS2.DBRecordName with a MessageBox; the answer will then show up in the Message Log of your Process Monitor.
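A minimal sketch of such a debug line (hypothetical; drop it inside ImportSegment):
/* Surface the underlying record name in the Message Log */
MessageBox(0, "", 0, 0, "DBRecordName: " | &RS2.DBRecordName);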
Finally a record object is created for this database table and it is filled with a row from the rowset. This record is inserted into the database table:
&REC2 = CreateRecord(@(&RecordName));
&RS2(&I).GetRecord(1).CopyFieldsTo(&REC2);
&SQL1 = CreateSQL("%Insert(:1)");
&SQL1.Execute(&REC2);
TLDR:
Table name can be found in the FileLayout or output in the ImportSegment Function as &RS2.DBRecordName

Running embedded R code in Oracle raise error

I am a newbie to Oracle R embedded execution.
I have the following code registered as a script:
BEGIN
sys.rqScriptDrop('TSFORECAST');
SYS.RQSCRIPTCREATE('TSFORECAST',
'function(dat){
require(ORE)
require(forecast)
myts <- ts(dat,frequency=12)
model <- auto.arima(myts)
fmodel <- forecast(model)
fm = data.frame(fmodel$mean, fmodel$upper,fmodel$lower)
names(fm) <- c("mean","l80","l95","u80","u95")
return(fm)
}'
);
END;
When I execute the function for the first time with this code:
select *
from table(
rqTableEval(
cursor(select balance from tmp_30),
cursor(select 1 as "ore.connect" from dual),
'select 1 mean, 1 l80, 1 l95, 1 u80, 1 u95 from dual',
'TSFORECAST'
)
)
it generates the results I expected. But after that it never produces any results; instead it raises this error:
ORA-20000: RQuery error
Error in (function () :
unused arguments (width = 480, bg = "white", type = "raster")
ORA-06512: at "RQSYS.RQTABLEEVALIMPL", line 112
ORA-06512: at "RQSYS.RQTABLEEVALIMPL", line 109
20000. 00000 - "%s"
*Cause: The stored procedure 'raise_application_error'
was called which causes this error to be generated.
*Action: Correct the problem as described in the error message or contact
the application administrator or DBA for more information.
I have searched this error but could not find anything helpful. Can anyone help me with this error?
