How do I split this array of strings into table rows in U-SQL?

I was trying this snippet to split my JSON array:
activity =
    // to extract json object required, only "activity" field is to be parsed and not "nameOfWebsite"
    EXTRACT activities : string
    FROM #input
    USING Extract.Json(rowPath: "[]");

activities_arr =
    SELECT
        // splitting into array based on delimiter
        new ARRAY<string>(activities.Split(',')) AS activities
    FROM activity;

activities_output =
    SELECT activities
    FROM activities_arr AS ac
    CROSS APPLY EXPLODE(ac.activities) AS activities // to split above array into rows
    ;
Input is like this
[
    {
        "nameOfWebsite": "StackOverflow", // this object is not required
        "activities": [
            "Python",
            "U-SQL",
            "JavaScript"
        ]
    }
]
So currently I am getting 5 columns of output: one column is some random string that is not in the input, followed by 3 blank columns, and then the 5th column contains Python, U-SQL and JavaScript in separate rows.
Questions:
Is there any way to avoid the other 4 columns, as I only require the column with the activity names?
Why are there blank spaces in my current output when my delimiter is defined as ','?
Current output ("blank" denotes a blank space, not the literal string "blank"):
AB#### "blank" "blank" "blank" Python
AB#### "blank" "blank" "blank" U-SQL
AB#### "blank" "blank" "blank" JavaScript
Output expected
Python
U-SQL
JavaScript
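
For the first question, one option (just a sketch, reusing the rowset names from the snippet above; the aliases t and a are illustrative) is to name the exploded element in the EXPLODE alias and project only that column. The Trim() call assumes that splitting on ',' leaves stray whitespace around each element:
activities_output =
    SELECT a.Trim() AS activity                     // keep only the exploded element
    FROM activities_arr AS ac
    CROSS APPLY EXPLODE(ac.activities) AS t(a);     // t(a) names the element column, so no other columns are carried along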

Related

How can I list unique characters in a dictionary and store them as a set?

I am trying to list unique characters in a dictionary and store them as a set. The dictionary has the following fields:
ID, Name, Description, Type, Price.
I need to list the unique categories in the "Type" field.
content=("C:\\Users\\jon.welsh\\Desktop\\ebyayproducts.json", "r")
for item in ebayproducts:
values = set([i['Type'] for i in content])
# and then I get this Error
> TypeError: string indices must be integers
Based on your example, you don't open the file - you just create a tuple called content that contains 2 string values.
To open the file and parse the JSON, you can do:
import json

# open the file and parse its JSON content into a Python object
with open("C:\\Users\\jon.welsh\\Desktop\\ebyayproducts.json", "r") as f_in:
    content = json.load(f_in)

# collect the unique values of the "Type" field
values = set(i["Type"] for i in content)
print(values)

Extract the numeric value from string in Kusto

This is my datatable:
datatable(Id:dynamic)
[
dynamic([987654321][Just Kusto Things]),
]
and I've extracted 1 field from a json using
| project ID=parse_json(Data).["CustomValue"]
And the result is something like [987654321][Just Kusto Things]. I want to extract the numeric value (987654321) within the first square brackets. What is the best way to retrieve that value: using split, parse, or extract?
The datatable in the sample is not valid. If the values are just an array, then you can get the result by using the array position, like this:
datatable(Id:dynamic)
[
dynamic([987654321,"Just Kusto Things"]),
]
| extend Id = Id[0]
If it is something else, please provide a valid datatable with an example that is representative of the real data.
> The result is something like [987654321][Just Kusto Things]. I want to extract the numeric value (987654321) within the first square brackets. How to best retrieve that value?

You can use the parse operator. For example:
print input = '[987654321][Just Kusto Things]'
| parse input with '[' output:long ']' *
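For the sample string, this takes everything between the leading '[' and the following ']' as a long, giving 987654321; the trailing * skips the remainder of the input.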

How to delete DDIC table records which have different id than row number in internal table?

I have an ALV with two rows. I want to delete these rows in the internal table and in the dictionary table as well. To get which rows of the ALV I chose, I use:
go_selections = go_salv->get_selections( ).
go_rows = go_selections->get_selected_rows( ).
Next, I iterate through the results with LOOP AT go_rows INTO gv_row.
Inside the above loop I have another loop, which stores data from the internal table into a work area. Then I set a counter variable which holds the id from the dictionary table and delete the respective row.
LOOP AT gr_data INTO lr_znewfdkey6.
  counter2 = lr_znewfdkey6-id.
  IF counter2 EQ gv_row.
    DELETE FROM znew_fdkey01 WHERE id EQ lr_znewfdkey6-id.
    MESSAGE 'Row deleted.' TYPE 'I'.
But unfortunately this only works when the id in the dictionary table is equal to the row number selected in the ALV. If lr_znewfdkey6-id in the dictionary table is, for example, 5, while get_selected_rows( ) returns values starting from one, the comparison never matches.
How can I fix this?
get_selected_rows( ) returns a table of line numbers.
lt_rows = lo_selections->get_selected_rows( ).
Those numbers correspond directly to the itab you loaded into the ALV, no matter whether it has been sorted or filtered. They do not correspond to any field in the database, such as an ID field.
Assuming gr_data is the itab assigned to the ALV, let's loop over lt_rows and read gr_data by index:
LOOP AT lt_rows ASSIGNING FIELD-SYMBOL(<row>).
  READ TABLE gr_data INTO ls_data INDEX <row>.
  IF sy-subrc = 0.
    APPEND ls_data TO lt_selected.
  ENDIF.
ENDLOOP.
After this executes, the selected gr_data lines have been collected into the lt_selected itab. To delete them:
LOOP AT lt_selected ASSIGNING FIELD-SYMBOL(<line>).
  DELETE TABLE gr_data FROM <line>.
ENDLOOP.
You could also simply do:
LOOP AT lt_rows ASSIGNING FIELD-SYMBOL(<row>).
  DELETE gr_data INDEX <row>.
ENDLOOP.
After that, refresh your ALV and you should be good.
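For completeness, a minimal sketch of that refresh, assuming go_salv is the CL_SALV_TABLE instance from the question:
" redisplay the ALV so the deleted rows disappear from the list
go_salv->refresh( ).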

Replacing empty string column with null in Kusto

How do I replace an empty (non-null) column of string datatype with a null value?
So say the following query returns a non-zero recordset:
mytable | where mycol == ""
Now these are the rows with mycol containing empty strings. I want to replace these with nulls. From what I have read in the Kusto documentation, there are datatype-specific null literals such as int(null), datetime(null), guid(null), etc., but there is no string(null). The closest to string is guid, but when I use it in the following manner, I get an error:
mytable | where mycol == "" | extend test = translate(mycol,guid(null))
The error:
translate(): argument #0 must be string literal
So what is the way out then?
Update:
datatable(n:int,s:string)
[
10,"hello",
10,"",
11,"world",
11,"",
12,""
]
| summarize myset=make_set(s) by n
If you execute this, you can see that empty strings are included in the sets. I don't want this; no empty strings should be part of my array. But at the same time I don't want to lose the value of n, which is exactly what will happen if I use the isnotempty function. In the following example, you can see that the row where n=12 is not returned; there is no need to skip n=12, since one could always get an empty array:
datatable(n:int,s:string)
[
10,"hello",
10,"",
11,"world",
11,"",
12,""
]
| where isnotempty(s)
| summarize myset=make_set(s) by n
There's currently no support for null values for the string datatype: https://learn.microsoft.com/en-us/azure/kusto/query/scalar-data-types/null-values
I'm pretty certain that, in itself, this shouldn't block you from reaching your end goal, but that goal isn't currently clear.
[update based on your update:]
datatable(n:int,s:string)
[
10,"hello",
10,"",
11,"world",
11,"",
12,""
]
| summarize make_set(todynamic(s)) by n

Splitting Columns in U-SQL

I am new to U-SQL and I am having a hard time splitting a column off from the rest of my file. In my EXTRACT I declared 4 columns because my file is pipe-delimited into 4 fields. However, I want to remove one of the columns I declared from the output. How do I do this?
The Json column of my file is what I want to split off, so that I end up with a new rowset that does not include it. Basically, Date, Status and PriceNotification should go into #result. This is what I have so far:
#input =
    EXTRACT
        Date string,
        Condition string,
        Price string,
        Json string
    FROM #in
    USING Extractor.Cvs;

#result =
    SELECT Json
    FROM #input

OUTPUT #input
    TO #out
    USING Outputters.Cvs();
Maybe I have misunderstood your question, but you can simply list the columns you want in the SELECT statement, e.g.
#input =
    EXTRACT
        Date string,
        Status string,
        PriceNotification string,
        Json string
    FROM #in
    USING Extractors.Text(delimiter: '|');

#result =
    SELECT Date, Status, PriceNotification
    FROM #input;

OUTPUT #result
    TO #out
    USING Outputters.Csv();
NB I have switched the variable in your OUTPUT statement to be #result. If this does not answer your question, please post some sample data and expected results.
