Extract the numeric value from string in Kusto - azure-data-explorer

This is my datatable:
datatable(Id:dynamic)
[
dynamic([987654321][Just Kusto Things]),
]
and I've extracted 1 field from a json using
| project ID=parse_json(Data).["CustomValue"]
And the result is something like - [987654321][Just Kusto Things]. I wanted to extract the numbered value(987654321) within the 1st square brackets. How to best retrieve that value? Using split/parse/extract?

the datatable in the sample is not valid. If the values are just an array then you can get the results by using the array position like this:
datatable(Id:dynamic)
[
dynamic([987654321,"Just Kusto Things"]),
]
| extend Id = Id[0]
If it is something else, please provide a valid datatable with an example that is representative of the real data.

the result is something like - [987654321][Just Kusto Things]. I wanted to extract the numbered value(987654321) within the 1st square brackets. How to best retrieve that value?
you can use the parse operator
For example:
print input = '[987654321][Just Kusto Things]'
| parse input with '[' output:long ']' *

Related

Find all records by the value of the json key in MariaDB 10.1

I have MariaDB 10.1. - I can't use JSON functions - JSON_EXTRACT etc.).
In the database I have a table CONTRACTS and a column data, which contains JSON (data type TEXT):
{"879": "Test", "880": "15255", "881": "2021-10-22"}
And I need to find all records that have a key value of "880" in some range, eg greater than 10000 and less than 20000, ie. in this case, a record with a value of 15255.
Thanks for advice.
Maybe something like this:
SELECT
TRIM(BOTH '"' FROM
REGEXP_SUBSTR(REGEXP_SUBSTR(CONTRACTS.`data`, '"880": "[0-9]+"'), '"[0-9]+"$')
) * 1 BETWEEN 10000 AND 20000
FROM
(SELECT
'{"879": "Test", "880": "15255", "881": "2021-10-22"}' AS `data`
) AS CONTRACTS
So the most internal regexp gives you the key + value. The outer regexp takes that result and extracts the value in quotes. Trim the quotes and test the value. You could use the entire TRIM(...) as a criterium .

Kusto sub query selection using toscalar - returns only last matching record

I am referring sqlcheatsheet - Nested queries
Query 1:
traces
| where customDimensions.Domain == "someDomain"
| where message contains "some-text"
| project itemId=substring(itemId,indexof(itemId,"-"),strlen(itemId))
Result :
itemId
-c580-11e9-888a-8776d3f65945
-c580-11e9-888a-8776d3f65945
-c580-11e9-9b01-c3be0f4a2bf2
Query 2:
traces
| where customDimensions.Domain == "someDomain"
| where itemId has toscalar(
traces
| where customDimensions.Domain == "someDomain"
| where message contains "some-text"
| project itemId=substring(itemId,indexof(itemId,"-"),strlen(itemId)))
Result for the second query returns records matching only last record of sub query
ie:) > -c580-11e9-9b01-c3be0f4a2bf2
Question :
How get entire result set that has matching with all the three items.
My requirement is to take entire sequence of logs for a particular request.
To get that I have below inputs, I could able to take one log, from that I can find ItemId
The itemId looks like "b5066283-c7ea-11e9-9e9b-2ff40863cba4". Rest of all logs related to this request must have "-c7ea-11e9-9e9b-2ff40863cba4" this value. Only first part will get incremented like b5066284 , b5066285, b5066286 like that.
toscalar(), as its name implies, returns a scalar value.
Given a tabular argument with N columns and M rows it'll return the value in the 1st column and the 1st row.
For example: the following will return a single value - 1
let T = datatable(a:int, b:int, c:int)
[
1,2,3,
4,5,6,
7,8,9,
]
;
print toscalar(T)
If I understand the intention in your 2nd query correctly, you should be able to achieve your requirement by using has_any.
For example:
let T = datatable(item_id:string)
[
"c580-11e9-888a-8776d3f65945",
"c580-11e9-888a-8776d3f65945",
"c580-11e9-9b01-c3be0f4a2bf2",
]
;
T
| where item_id has_any (
(
T
| parse item_id with * "-" item_id
)
)

Replacing empty string column with null in Kusto

How do I replace empty (non null) column of string datatype with null value?
So say the following query returns non zero recordset:-
mytable | where mycol == ""
Now these are the rows with mycol containing empty strings. I want to replace these with nulls. Now, from what I have read in the kusto documentation we have datatype specific null literals such as int(null),datetime(null),guid(null) etc. But there is no string(null). The closest to string is guid, but when I use it in the following manner, I get an error:-
mytable | where mycol == "" | extend test = translate(mycol,guid(null))
The error:-
translate(): argument #0 must be string literal
So what is the way out then?
Update:-
datatable(n:int,s:string)
[
10,"hello",
10,"",
11,"world",
11,"",
12,""
]
| summarize myset=make_set(s) by n
If you execute this, you can see that empty strings are being considered as part of sets. I don't want this, no such empty strings should be part of my array. But at the same time I don't want to lose value of n, and this is exactly what will happen if I if I use isnotempty function. So in the following example, you can see that the row where n=12 is not returned, there is no need to skip n=12, one could always get an empty array:-
datatable(n:int,s:string)
[
10,"hello",
10,"",
11,"world",
11,"",
12,""
]
| where isnotempty(s)
| summarize myset=make_set(s) by n
There's currently no support for null values for the string datatype: https://learn.microsoft.com/en-us/azure/kusto/query/scalar-data-types/null-values
I'm pretty certain that in itself, that shouldn't block you from reaching your end goal, but that goal isn't currently clear.
[update based on your update:]
datatable(n:int,s:string)
[
10,"hello",
10,"",
11,"world",
11,"",
12,""
]
| summarize make_set(todynamic(s)) by n

REGEXP_SUBSTR return all matches (mariaDB)

I need to get all the matches of a regular expression in a text field in a MariaDB table. As far as I know REGEXP_SUBSTR is the way to go to get the value of the match of a regular expression in a text field, but it always returns after the first match and I would like to get all matches.
Is there any way to do this in MariaDB?
An example of the content of the text field would be:
#Generation {
// 1
True =>
`CP?:24658` <= `CPV?:24658=57186`;
//`CP?23432:24658` <= `CPV?:24658=57186`
// 2
`CP?:24658` <> `CPV?:24658=57178` =>
`CP?:24656` <> `CPV?:24656=57169`;
And the select expression that I'm using right now is:
select REGEXP_SUBSTR(textfield,'CP\\?(?:\\d*:)*24658') as my_match
from table
where id = 1243;
Which at the moment returns just the first match:
CP?:24658
And I would like it to return all matches:
CP?:24658
CP?23432:24658
CP?:24658
Use just REGEXP to find the interesting rows. Put those into a temp table
Repeatedly process the temp table -- but remove the SUBSTR as you use it.
What will you be doing with each substr? Maybe that will help us devise a better approach.

Splitting Columns in USQL

I am new to USQL and I am having a hard time splitting a column from the rest of my file. With my EXTRACTOR I declared 4 columns because my file is split into 4 pipes. However, I want to remove one of the columns I declared from the file. How do I do this?
The Json column of my file is what I want to split off and make you new object that does not include it. Basically splitting Date, Status, PriceNotification into the #result. This is what I have so far:
#input =
EXTRACT
Date string,
Condition string,
Price string,
Json string
FROM #in
USING Extractor.Cvs;
#result =
SELECT Json
FROM #input
OUTPUT #input
TO #out
USING Outputters.Cvs();
Maybe I have misunderstood your question, but you can simply list the columns you want in the SELECT statement, eg
#input =
EXTRACT
Date string,
Status string,
PriceNotification string,
Json string
FROM #in
USING Extractor.Text('|');
#result =
SELECT Date, Status, PriceNotification
FROM #input;
OUTPUT #result
TO #out
USING Outputters.Cvs();
NB I have switched the variable in your OUTPUT statement to be #result. If this does not answer your question, please post some sample data and expected results.

Resources