I have a table in Kusto with 13,000 rows. How can I create a new column in this table that is filled randomly with only 2 values (0 and 1)? Is it also possible to create a column containing one of 3 different values of data type string?
You can extend the table with a calculated column using the rand() function: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/randfunction
for example:
0 or 1:
| extend y = toint(rand() > 0.5) // rand() with no argument returns a real in [0.0, 1.0)
1 of 3 strings (first, second or third):
| extend r = rand(3)
| extend s = case(r <= 0, "first", r <= 1, "second", "third")
| project-away r
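To sanity-check the two snippets together, here is a minimal self-contained sketch. The range operator is only a stand-in for the real 13,000-row table, and the column names id and Rows are made up for illustration:

```kusto
range id from 1 to 13000 step 1          // stand-in for the 13,000-row table
| extend y = toint(rand() > 0.5)         // 0 or 1, evaluated per row
| extend r = rand(3)                     // 0, 1 or 2
| extend s = case(r == 0, "first", r == 1, "second", "third")
| project-away r
| summarize Rows = count() by y, s       // roughly even counts are expected
| order by y asc, s asc
```

Since the values are random, the counts will differ from run to run.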
If you need to do this at ingestion time, you can use an update policy: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/management/updatepolicy
Or, if you want to do this for the existing table, you can use a .set-or-replace command: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/management/data-ingestion/ingest-from-query
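A rough sketch of what the .set-or-replace approach might look like. MyTable is a hypothetical table name, and the extend_schema property is assumed to be needed because the query adds a column the table does not have yet. Try it on a copy of the table first, since the command replaces the existing data:

```kusto
// Hypothetical table name; replaces the table's data with the query output.
.set-or-replace with (extend_schema = true) MyTable <|
    MyTable
    | extend y = toint(rand() > 0.5)
```

The random values are computed once, when the command runs, and are then stored with the rows.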
Let's say I have a query like:
cluster("cluster1").database("db2").Table3
| distinct * // distinct combinations of data
| take 5 // take 5
How do I save the values from a column in the results output to a pack_array variable?
I want to use this pack_array variable for follow-on queries like:
cluster("cluster2").database("db3").Table1
| where ColumnofInterest in (pack_array_var from above)
| take 5 // take 5
Pass "*" as the argument to pack_array() and use a "let" statement. Here is an example:
let ValuesFromTheOtherCluster = cluster('cluster1').database('db2').Table3
| extend tempArray = pack_array(*)
| summarize filters = make_set(tempArray);
cluster('cluster2').database("db3").Table1
| where ColumnofInterest in (ValuesFromTheOtherCluster)
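If only one column's values are needed, a common alternative is to collect them into a single dynamic array with make_set() and toscalar(). The sketch below uses small made-up datatables (Source, Target, and the Other column are illustrative stand-ins for the cross-cluster tables) so that it runs on its own:

```kusto
// Stand-in for cluster('cluster1').database('db2').Table3
let Source = datatable(ColumnofInterest:string, Other:long)
[
    "a", 1,
    "b", 2,
    "c", 3
];
// Stand-in for cluster('cluster2').database('db3').Table1
let Target = datatable(ColumnofInterest:string)
[
    "a",
    "c",
    "z"
];
// Collect the distinct values of one column into a single dynamic array
let Values = toscalar(Source | take 5 | summarize make_set(ColumnofInterest));
Target
| where ColumnofInterest in (Values)
```

With this sample data, the rows "a" and "c" should come back, because "z" does not appear in Source.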
Using query_parameters, how can I:
1. specify a result column name (ex: summarize ResultColumnName = count())
2. specify the value of a bin, when value is actually the name of a column in the table
This is easiest to summarize with an example:
let myTable = datatable (Timestamp:datetime)
[datetime(1910-06-11),
datetime(1930-01-01),
datetime(1997-06-25),
datetime(1997-06-25)];
let UntrustedUserInput_ColumnName = "MyCount"; // actually from query_parameters
let UntrustedUserInput_BinValue = "Timestamp"; // actually from query_parameters
let UntrustedUserInput_BinRoundTo = "365d"; // actually from query_parameters
// the query I really want to perform
myTable
| summarize MyCount=count() by bin(todatetime(Timestamp), totimespan(365d));
// what the query looks like if I use query_parameters
myTable
| summarize UntrustedUserInput_ColumnName=count() by bin(todatetime(UntrustedUserInput_BinValue), totimespan(UntrustedUserInput_BinRoundTo));
Results:
Timestamp MyCount
--------- -------
1909-09-26T00:00:00Z 1
1929-09-21T00:00:00Z 1
1996-09-04T00:00:00Z 2
Column1 UntrustedUserInput_ColumnName
------- -----------------------------
4
I can't find a solution to #1.
It appears #2 can almost be solved by using column_ifexists, but I don't have a "default" to fall back on; I'd rather just fail if the column doesn't exist.
Treating column names as variables is not possible, since column names are part of the result schema coming out of each operator (with the exception of the "evaluate" operator; see specifically the pivot plugin).
There is actually a way to give a column a name that is held in a variable, using a hacky trick:
let VariableColumnName = "TestColumn"; // the new column name that you want
range i from 1 to 5 step 1 // this is just a sample query
| project pack(VariableColumnName, i) // packs the value into a property bag (JSON); the unnamed result column is called Column1
| evaluate bag_unpack(Column1) // unpacking the bag creates a column with a dynamic name
This will return a column named TestColumn, which is set in VariableColumnName.
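Applied to the summarize query from the question, the same trick might look like the sketch below. ResultColumnName stands in for the value that would come from query_parameters, and Count and Packed are temporary names introduced only for illustration:

```kusto
let myTable = datatable (Timestamp:datetime)
[
    datetime(1910-06-11),
    datetime(1930-01-01),
    datetime(1997-06-25),
    datetime(1997-06-25)
];
let ResultColumnName = "MyCount"; // stands in for a query parameter
myTable
| summarize Count = count() by Timestamp = bin(Timestamp, 365d)
| project Timestamp, Packed = pack(ResultColumnName, Count) // bag keyed by the desired name
| evaluate bag_unpack(Packed)                               // turns the key into a column name
```

The result should have the columns Timestamp and MyCount. This only covers #1; the bin column is still referenced by its literal name.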
Here is my table data from which I want to assign values to the record.
Member_ID | Claim_ID | Codes | Pull
123 | Y | 12,23,35,78 | Y
123 | N | 12,35 | Y
123 | N | 23,34 | N
123 | N | 33,34 | N
I am using Teradata to assign 'Y' or 'N' to Pull depending on the codes and claims.
SEL A.MEMBER_ID,A.CLAIM_ID,A.CODES,
'Y' AS PULL
FROM (SEL * FROM DBC.PULL_COMP WHERE CLAIM_ID='Y') A
INNER JOIN (SEL * FROM DBC.PULL_COMP WHERE CLAIM_ID='N') B
ON A.MEMBER_ID=B.MEMBER_ID
UNION
SEL B.MEMBER_ID,B.CLAIM_ID,B.CODES,
CASE WHEN OREPLACE(A.CODES,B.CODES,B.CODES)=A.CODES THEN 'Y'
ELSE 'N' END AS PULL
FROM (SEL * FROM DBC.PULL_COMP WHERE CLAIM_ID='Y') A
INNER JOIN (SEL * FROM DBC.PULL_COMP WHERE CLAIM_ID='N') B
ON A.MEMBER_ID=B.MEMBER_ID
If the Claim_id is 'Y', the Pull will remain 'Y'. I want to compare the records whose claim_id is 'Y' with those whose claim_id is 'N'. The 2nd record contains no new numbers compared with the 1st record, so Pull='Y'. The 3rd record contains one new number (34), hence Pull='N'. The 4th record contains only new numbers compared with the 1st record, hence 'N'. If there is even one new number, then Pull='N'. Only if all the numbers (Codes) of Claim_id='N' match the Codes of Claim_id='Y' is Pull='Y'. I am populating the Pull column based on member_id, claim_id and codes.
I am not getting the desired result with the above query.
I have a Kusto table named counts, with 4 rows and 3 columns, that has the following elements:
HasFailure FunnelPhase count_
0 Experienced 172425
0 NewSubs 25399
1 Experienced 3289
1 NewSubs 643
I would like to access the 3rd element in the 2nd column and save it to a scalar. I have tried the following code:
let value = counts | project count_ lookup 3;
But I am not able to obtain the desired result. What would be the correct way in which to obtain this value?
You'll need to order the records in your table (according to an order you define), then access the 3rd record (according to that same order), and finally project the specific column you're interested in.
e.g.:
let T =
datatable(HasFailure:bool, FunnelPhase:string, count_:long)
[
0, 'Experienced', 172425,
0, 'NewSubs', 25399,
1, 'Experienced', 3289,
1, 'NewSubs', 643,
]
;
let 3rd_element_in_2nd_column = toscalar(
T
| order by count_ desc
| where row_number() == 3
| project FunnelPhase
)
;
print result = 3rd_element_in_2nd_column
Basically I'd like to pass in a set of field values to a function so I can use in/!in operators. I'd prefer to be able to use the result of a previous query rather than having to construct a set manually.
As in:
let today = exception | where EventInfo_Time > ago(1d) | project exceptionMessage;
MyAnalyzeFunction(today)
What, then, is the signature of MyAnalyzeFunction?
See: https://learn.microsoft.com/en-us/azure/kusto/query/functions/user-defined-functions
For instance, the following will return a table with a single column (y) with the values 2 and 3:
let someTable = range x from 2 to 10 step 1
;
let F = (T:(x:long))
{
range y from 1 to 3 step 1
| where y in (T)
}
;
F(someTable)
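Adapted to the question's scenario, a sketch could look like the following. ExceptionEvents and its rows are made-up sample data, the function body is only an illustration of using the tabular parameter with the in operator, and a fixed datetime replaces ago(1d) so that the example is reproducible:

```kusto
// Made-up sample data standing in for the real exception table
let ExceptionEvents = datatable(EventInfo_Time:datetime, exceptionMessage:string)
[
    datetime(2024-01-01), "timeout",
    datetime(2024-01-02), "null reference",
    datetime(2024-01-03), "timeout"
];
// The parameter T is declared as a table with one string column
let MyAnalyzeFunction = (T:(exceptionMessage:string))
{
    ExceptionEvents
    | where exceptionMessage in (T)
    | summarize Occurrences = count() by exceptionMessage
};
let recent = ExceptionEvents
    | where EventInfo_Time > datetime(2024-01-02)
    | project exceptionMessage;
MyAnalyzeFunction(recent)
```

With this sample data, recent contains only "timeout", so the function should return a single row for "timeout" with 2 occurrences.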