How can I create SQL output to a CSV file in Application Engine? - peoplesoft

I am new to PeopleTools and PeopleCode.
I have a SQL statement, and I want to gather its output (all data returned by the SQL) and write it to a CSV file.
Can you provide sample code for creating the CSV file?
Do I need to create a file layout for the App Engine process?

You don't necessarily need a file layout to create a CSV. You just have to be careful with the data you put into it, because any value that contains a comma will be split into an extra column.
&filename = "YourFilePath";
&newFile = GetFile(&filename, "w", "a", %FilePath_Absolute);
&newFile.WriteLine("id" | "," | "type" | "," | "streetAddress" | "," | "addressLine2" | "," | "addressLine3" | "," | "locality" | "," | "postalCode" | "," | "region" | "," | "country" | "," | "source" | "," | "requester");
&rsActiveEmp = CreateRowset(Record.RECORD_TO_PROCESS);
&rsActiveEmp.Fill("where processed = 'N'");
For &i = 1 To &rsActiveEmp.ActiveRowCount
   &emplid = &rsActiveEmp.GetRow(&i).RECORD_TO_PROCESS.EMPLID.Value;
   Local TST_PERSON:PERSON &person = create TST_PERSON:PERSON(&emplid);
   Local array of string &address = &person.getMaxEdAddress();
   &newFile.WriteLine(&person.id_ | "," | "Mailing Address" | "," | &address [1] | "," | &address [2] | "," | "" | "," | &address [4] | "," | &address [5] | "," | &address [6] | "," | &address [7] | "," | "1111111" | "," | "FictitiousPerson");
End-For;
&newFile.Close();
You'll need to write the header row manually and then loop through your dataset to write the remaining rows. In my example, TST_PERSON is a custom app class I use to create "person" objects for commonly used bio-demo data retrieval. In your case you could just use a SQLExec statement to gather that data for each row.
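To illustrate the comma caveat, here is a minimal Python sketch of the usual CSV quoting rule: wrap a value in double quotes when it contains a comma, a double quote, or a line break, and double any embedded quotes. The helper names `csv_field` and `csv_line` and the sample values are hypothetical, and this is not PeopleCode; the same rule would have to be ported to whatever code builds the string passed to WriteLine.

```python
def csv_field(value: str) -> str:
    """Quote a single value if it would otherwise break the CSV layout."""
    if any(ch in value for ch in ',"\r\n'):
        # Embedded double quotes are escaped by doubling them.
        return '"' + value.replace('"', '""') + '"'
    return value


def csv_line(values: list[str]) -> str:
    """Join already-stringified values into one CSV line."""
    return ",".join(csv_field(v) for v in values)


# Hypothetical sample row: the street address contains a comma.
print(csv_line(["1001", "Mailing Address", "12 Main St, Apt 4", 'The "Oaks"']))
```

With this in place, an address such as "12 Main St, Apt 4" stays in a single column when the file is opened in a spreadsheet.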

Related

split string column value into multiple rows in kusto

I have a string column whose values contain a "," delimiter, and I want to split this column into multiple rows.
Here's the table
|Token |Shop|
|a |P |
|A10,A9a,C1a,F1 |R |
Expected Output:
|Token |Shop|
|a |P |
|A10 |R |
|A9a |R |
|C1a |R |
|F1 |R |
I tried the logic below using mv-expand, but it doesn't work:
datatable(Tokens:string, Shop:string)["a", "P",
"A10,A9a,C1a,F1", "R" ]
| mv-expand Token =todynamic(Tokens) to typeof(string)
You can use split() before mv-expand:
datatable(Tokens:string, Shop:string)["a","P","A10,A9a,C1a,F1","R" ]
| mv-expand token = split(Tokens, ",") to typeof(string)
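For readers who think better in a general-purpose language, the split-then-expand step can be sketched in Python as follows. This only mirrors the logic of the query; the function name `expand_tokens` is made up for the illustration.

```python
def expand_tokens(rows: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Emit one (token, shop) row per comma-separated token."""
    expanded = []
    for tokens, shop in rows:
        # Splitting first produces the list that the expansion iterates over.
        for token in tokens.split(","):
            expanded.append((token, shop))
    return expanded


rows = [("a", "P"), ("A10,A9a,C1a,F1", "R")]
for token, shop in expand_tokens(rows):
    print(token, shop)
```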

List all strings appearing more than once in a file

I have a very large file (around 70GB), and I want to list all strings that appear more than once in the whole file.
I can list all the matches when I specify which string to search in a file, but I want to list all strings that have more than one occurrence.
For example, assuming my file looks like this:
+------+------------------------------------------------------------------+----------------------------------+--+
| HHID | VAL_CD64 | VAL_CD32 | |
+------+------------------------------------------------------------------+----------------------------------+--+
| 203 | 8c5bfd9b6755ffcdb85dc52a701120e0876640b69b2df0a314dc9e7c2f8f58a5 | 373aeda34c0b4ab91a02ecf55af58e15 | |
| 7AB | f6c581becbac4ec1291dc4b9ce566334b1cb2c85e234e489e7fd5e1393bd8751 | 2c4f97a04f02db5a36a85f48dab39b5b | |
| 7AB | abad845107a699f5f99575f8ed43e0440d87a8fc7229c1a1db67793561f0f1c3 | 2111293e946703652070968b224875c9 | |
| 348 | 25c7cf022e6651394fa5876814a05b8e593d8c7f29846117b8718c3dd951e496 | 5c80a555fcda02d028fc60afa29c4a40 | |
| 348 | 67d9c0a4bb98900809bcfab1f50bef72b30886a7b48ff0e9eccf951ef06542f9 | 6c10cd11b805fa57d2ca36df91654576 | |
| 348 | 05f1e412e7765c4b54a9acfd70741af545564f6fdfe48b073bfd3114640f5e37 | 6040b29107adf1a41c4f5964e0ff6dcb | |
| 4D3 | 3e8da3d63c51434bcd368d6829c7cee490170afc32b5137be8e93e7d02315636 | 71a91c4768bd314f3c9dc74e9c7937e8 | |
+------+------------------------------------------------------------------+----------------------------------+--+
And I want to list only the HHID values that appear more than once, i.e., 7AB and 348.
Any idea how I can implement this?
awk to the rescue:
awk -F'[ |]+' '
    $2 ~ /^[[:alnum:]]+$/ { count[$2]++ }
    END {
        for (hhid in count) {
            if (count[hhid] >= 2) {
                print hhid
            }
        }
    }
' file
-F'[ |]+' sets the field separator.
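$2 ~ /^[[:alnum:]]+$/ skips the horizontal rule lines, whose second field is empty. The header row still matches, but HHID is counted only once, so it never reaches the threshold of 2.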
count[$2]++ increases the value at $2, the string we’re counting. On the first occurrence this initialises the value to 1. On the second occurrence it increases it to 2, and so on.
END is run after all lines have been processed.
for (hhid in count) iterates over the strings in count.
if (count[hhid] >= 2) skips any string that was counted fewer than two times.
print hhid prints the string.
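The same counting approach can be sketched in Python, in case awk is not available. This is an illustration under the same assumptions as the awk script (the ID is the second field once the line is split on runs of spaces and pipes); the function name `repeated_ids` is hypothetical.

```python
import re
from collections import Counter
from typing import Iterable

FIELD_SEP = re.compile(r"[ |]+")      # same separator as awk -F'[ |]+'
ALNUM = re.compile(r"[A-Za-z0-9]+")   # ASCII equivalent of [[:alnum:]]+


def repeated_ids(lines: Iterable[str]) -> list[str]:
    """Return the second-field values that occur on two or more lines."""
    counts = Counter()
    for line in lines:
        fields = FIELD_SEP.split(line.rstrip("\n"))
        # Rule lines such as +------+ have no second field and are skipped.
        if len(fields) > 1 and ALNUM.fullmatch(fields[1]):
            counts[fields[1]] += 1
    return sorted(key for key, n in counts.items() if n >= 2)
```

Passing an open file object (for example `with open(path) as f: repeated_ids(f)`) reads the input line by line, so memory use grows with the number of distinct IDs rather than with the file size.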

Split column string with delimiters into separate columns in azure kusto

I have a column 'Apples' in an Azure table that contains this string: "Colour:red,Size:small".
Current situation:
|-----------------------|
| Apples |
|-----------------------|
| Colour:red,Size:small |
|-----------------------|
Desired Situation:
|----------------|
| Colour | Size |
|----------------|
| Red | small |
|----------------|
Please help
I'll answer the title as I noticed many people searched for a solution.
The key here is the mv-expand operator, which expands multi-value dynamic arrays or property bags into multiple records:
datatable (str:string)["aaa,bbb,ccc", "ddd,eee,fff"]
| project splitted=split(str, ',')
| mv-expand col1=splitted[0], col2=splitted[1], col3=splitted[2]
| project-away splitted
The project-away operator lets us select which columns from the input to exclude from the output.
Result:
+--------------------+
| col1 | col2 | col3 |
+--------------------+
| aaa | bbb | ccc |
| ddd | eee | fff |
+--------------------+
This query gave me the desired results:
| parse Apples with "Colour:" AppColour ", Size:" AppSize
Remember to include every delimiter that precedes each word you want to extract, e.g. ", Size:". Mind the space after the comma.
This documentation page helped me, and I then customized the query to fit my needs:
https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/parseoperator
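As a language-neutral illustration of what the extraction does, here is a small Python sketch that turns a "key:value,key:value" string into named fields. It assumes keys and values contain no commas or colons; the function name `parse_pairs` is made up for the example.

```python
def parse_pairs(text: str) -> dict[str, str]:
    """Split 'Colour:red,Size:small' into {'Colour': 'red', 'Size': 'small'}."""
    pairs = {}
    for item in text.split(","):
        # partition splits on the first ':' only.
        key, _, value = item.partition(":")
        # strip() tolerates an optional space after the comma.
        pairs[key.strip()] = value.strip()
    return pairs


print(parse_pairs("Colour:red,Size:small"))
```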

Parse data in Kusto

I am trying to parse the below data in Kusto. Need help.
[[ObjectCount][LinkCount][DurationInUs]]
[ChangeEnumeration][[88][9][346194]]
[ModifyTargetInLive][[3][6][595903]]
Need generic implementation without any hardcoding.
Ideally, you'd be able to change the component that produces the source data so that it emits a standard format (e.g. CSV or JSON) instead.
The following could work, but you should consider it very inefficient:
let T = datatable(s:string)
[
    '[[ObjectCount][LinkCount][DurationInUs]]',
    '[ChangeEnumeration][[88][9][346194]]',
    '[ModifyTargetInLive][[3][6][595903]]',
];
let keys = toscalar(
    T
    | where s startswith "[["
    | take 1
    | project extract_all(@'\[([^\[\]]+)\]', s)
);
T
| where s !startswith "[["
| project values = extract_all(@'\[([^\[\]]+)\]', s)
| mv-apply with_itemindex = i keys on (
    extend Category = tostring(values[0]), p = pack(tostring(keys[i]), values[i + 1])
    | summarize b = make_bag(p) by Category
)
| project-away values
| evaluate bag_unpack(b)
--->
| Category | ObjectCount | LinkCount | DurationInUs |
|--------------------|-------------|-----------|--------------|
| ChangeEnumeration | 88 | 9 | 346194 |
| ModifyTargetInLive | 3 | 6 | 595903 |
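If the data can be pre-processed before it reaches Kusto, the same idea (take the column names from the line that starts with "[[", then pair them with the bracketed values on every other line) can be sketched in Python. This is only an illustration of the logic, not part of the original answer; the function name `parse_rows` is hypothetical and values are kept as strings.

```python
import re

# Matches one innermost [ ... ] group and captures its content.
BRACKETED = re.compile(r"\[([^\[\]]+)\]")


def parse_rows(lines: list[str]) -> list[dict[str, str]]:
    """Turn the bracketed header and data lines into one dict per data line."""
    header = next(line for line in lines if line.startswith("[["))
    keys = BRACKETED.findall(header)
    rows = []
    for line in lines:
        if line.startswith("[["):
            continue  # the header line carries no values
        category, *values = BRACKETED.findall(line)
        rows.append({"Category": category, **dict(zip(keys, values))})
    return rows


data = [
    "[[ObjectCount][LinkCount][DurationInUs]]",
    "[ChangeEnumeration][[88][9][346194]]",
    "[ModifyTargetInLive][[3][6][595903]]",
]
for row in parse_rows(data):
    print(row)
```

No column name is hardcoded: adding a fourth key to the header line and a fourth value to each data line produces a fourth field.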

What is the meaning of SP and HT in the separators definition

In the HTTP headers RFC I need to understand the definition of token:
token = 1*<any CHAR except CTLs or separators>
separators = "(" | ")" | "<" | ">" | "@"
| "," | ";" | ":" | "\" | <">
| "/" | "[" | "]" | "?" | "="
| "{" | "}" | SP | HT
I don't understand what SP and HT mean at the end of the separators list. How would I write this as a regex?
Both are defined in the very same RFC:
SP = <US-ASCII SP, space (32)>
HT = <US-ASCII HT, horizontal-tab (9)>
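As for the regex part of the question, here is a minimal Python sketch. It assumes the RFC 2616 definitions: token = 1*<any CHAR except CTLs or separators>, CHAR is US-ASCII 0 to 127, CTL is 0 to 31 plus 127, and the separator list contains "@". The names `SEPARATORS`, `TOKEN_CHARS`, `TOKEN_RE`, and `is_token` are made up for the illustration.

```python
import re

# Separator characters from the grammar; SP is space (32), HT is horizontal tab (9).
SEPARATORS = '()<>@,;:\\"/[]?={}' + " " + "\t"

# Candidates are the non-control ASCII characters 32..126, minus the separators.
TOKEN_CHARS = "".join(
    chr(code) for code in range(32, 127) if chr(code) not in SEPARATORS
)

# token = one or more of the remaining characters.
TOKEN_RE = re.compile("[" + re.escape(TOKEN_CHARS) + "]+")


def is_token(text: str) -> bool:
    """Return True if the whole string is a single token."""
    return TOKEN_RE.fullmatch(text) is not None
```

For example, `is_token("Content-Type")` is true, while `is_token("text/html")` and `is_token("a b")` are false because "/" and SP are separators.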
