How to handle inner Json when using JsonOutputter - azure-cosmosdb

I'm converting some csv files into Json using the JsonOutputter. In the csv files I have a field containing Json like this (pipe character is delimiter):
...|{ "type":"Point", "coordinates":[ 18.7726, 74.5091 ] }|...
When it's output to Json, the result looks like this:
"Location": "{ \"type\":\"Point\", \"coordinates\":[ 18.7726, 74.5091 ] }"
I would like to get rid of the outer quotes to make the Json look like this:
"Location": { "type":"Point", "coordinates":[ 18.7726, 74.5091 ] }
What is the best way to accomplish this? The output Json will be stored in Cosmos DB, so I guess the "cleaning up" of the Json could be done either in U-SQL or in Cosmos DB?

The sample outputter only generates flat JSON. Since we do not have a JSON data type, any string value has to be escaped so that it remains a string value.
You can write your own custom outputter that, for example, takes SqlMap instances for nested values and outputs them as nested JSON, or, if you know that some strings in the rowset are really JSON and not just strings, serializes them without the quotes.
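The second option (serializing known-JSON strings without the quotes) can be illustrated independently of U-SQL. The following is a minimal Python sketch, not a U-SQL outputter; the function name `rows_to_json` and the `json_columns` parameter are hypothetical, and it assumes the caller knows which columns hold JSON text.

```python
import json

def rows_to_json(rows, json_columns):
    """Serialize rows, emitting the named columns as nested JSON instead of escaped strings."""
    out = []
    for row in rows:
        converted = dict(row)
        for col in json_columns:
            # Parse the embedded JSON text so it is written as an object, not a quoted string.
            converted[col] = json.loads(row[col])
        out.append(converted)
    return json.dumps(out)

rows = [{"number": "1", "Location": '{ "type":"Point", "coordinates":[ 18.7726, 74.5091 ] }'}]
result = rows_to_json(rows, json_columns=["Location"])
print(result)
```

Columns not listed in `json_columns` are still written as ordinary escaped strings, which is the safe default when it is not certain that a value is JSON.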

If JsonOutputter is not the only option, we could convert the CSV file to JSON with custom code.
I tested it with the following CSV file.
number|Location
1|{ "type":"Point", "coordinates":[ 13.7726, 73.5091 ] }
2|{ "type":"Point", "coordinates":[ 14.7726, 74.5091 ] }
Please try the following code; it works correctly on my side.
using System;
using System.IO;
using System.Linq;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

// Read the pipe-delimited file, skipping empty lines (e.g. a trailing newline).
var lines = File.ReadAllLines(@"C:\Tom\tomtest.csv").Where(l => l.Length > 0);
var csv = lines.Select(l => l.Split('|')).ToList();
var headers = csv[0];
// Map each row to a header/value dictionary, then parse the Location column as JSON.
var dicts = csv.Skip(1)
    .Select(row => headers.Zip(row, Tuple.Create).ToDictionary(p => p.Item1, p => p.Item2))
    .Select(x => new
    {
        number = x["number"],
        location = JObject.Parse(x["Location"])
    });
string json = JsonConvert.SerializeObject(dicts);
Console.WriteLine(json);

Related

r json mongodb query $in operator syntax error due to double quotes?

I'm building a json query to pass to a mongodb database in R.
In one scenario, I have a vector of dates and I want to query the database to return all records which have a date in the relevant field that matches a date in my vector of dates.
The second scenario is the same as the first, but this time I have a vector of character strings (IDs) and need to return all the records with matching IDs.
I understood the correct way to do this in a json query is to use the $in operator, and then put my vector in an array.
However, when I pass the query to my mongodb database, the exportLogId returns NULL. I'm quite sure that the problem is something to do with how I am representing the $in operator in the final query, since I have very similarly structured queries without the $in operator and they are all working. If I look for just one of my target dates or character strings, I get the desired result.
I followed the mongodb manual here to construct my query, and the only issue I can see is that the $in operator in the output of jsonlite::toJSON() is enclosed in double quotes; whereas I think it might need to be in single quotes (or no quotes at all, but I don't know how to write the syntax for that).
I'm creating my query in two steps:
1. Create the query as a series of nested lists
2. Convert the list object to JSON with jsonlite::toJSON()
Here is my code:
# Load libraries:
library(jsonlite)

# Create list of example dates to query in mongodb format:
sampledates <- c("2022-08-11T00:00:00.000Z",
                 "2022-08-15T00:00:00.000Z",
                 "2022-08-16T00:00:00.000Z",
                 "2022-08-17T00:00:00.000Z",
                 "2022-08-19T00:00:00.000Z")

# Create query as a list object:
query_list_l <- list(filter =
  # Add where clause:
  list(where =
    # Filter results by list of sample dates:
    list(dateSampleTaken = list('$in' = sampledates),
         # Define format of column names and values:
         useDbColumns = "true",
         dontTranslateValues = "true",
         jsonReplaceUndefinedWithNull = "true"),
    # Define columns to return:
    fields = c("id",
               "updatedAt",
               "person.visualId",
               "labName",
               "sampleIdentifier",
               "dateSampleTaken",
               "sequence.hasSequence")))

# Convert list object to JSON:
query_json = jsonlite::toJSON(x = query_list_l,
                              pretty = TRUE,
                              auto_unbox = TRUE)
The JSON query now looks like this:
> query_json
{
  "filter": {
    "where": {
      "dateSampleTaken": {
        "$in": ["2022-08-11T00:00:00.000Z", "2022-08-15T00:00:00.000Z", "2022-08-16T00:00:00.000Z", "2022-08-17T00:00:00.000Z", "2022-08-19T00:00:00.000Z"]
      },
      "useDbColumns": "true",
      "dontTranslateValues": "true",
      "jsonReplaceUndefinedWithNull": "true"
    },
    "fields": ["id", "updatedAt", "person.visualId", "labName", "sampleIdentifier", "dateSampleTaken", "sequence.hasSequence"]
  }
}
As you can see, $in is now enclosed in double quotes, even though I put it in single quotes when I created the query as a list object. I have tried replacing with sprintf() but that just adds a lot of backslashes to my query. I also tried:
query_fixed <- gsub(pattern = "\\"\\$\\in\\"",
replacement = "\\'$in\\'",
x = query_json)
... but this fails with an error.
I would be very grateful to know if:
1. Is the syntax problem that is preventing $in from working actually the double quotes?
2. If the double quotes are the problem, how do I replace them with single quotes without messing up the JSON format?
UPDATE:
The issue seems to occur when R is passing the query to the database, but I still can't work out exactly why.
If I try the query out in loopback explorer in the database, it works and using the export log ID produced, I can then fetch the results with httr::GET() in R. Example query results are shown below (sorry for the hashes - the main point is you can see the format of the returned values):
[1] "[{\"_id\":\"e59953b6-a106-4b69-9e25-1c54eef5264a\",\"updatedAt\":\"2022-09-12T20:08:39.554Z\",\"dateSampleTaken\":\"2022-08-16T00:00:00.000Z\",\"labName\":\"LNG_REFERENCE_DATA_CATEGORY_LAB_NAME_LAB_A\",\"sampleIdentifier\":\"LS0044-SCV2-PCR\",\"sequence\":{\"hasSequence\":false},\"person\":{\"visualId\":\"C-2022-0002\"}},{\"_id\":\"af5cd9cc-4813-4194-b60b-7d130bae47bc\",\"updatedAt\":\"2022-09-12T20:11:07.467Z\",\"dateSampleTaken\":\"2022-08-17T00:00:00.000Z\",\"labName\":\"LNG_REFERENCE_DATA_CATEGORY_LAB_NAME_LAB_A\",\"sampleIdentifier\":\"LS0061-SCV2-PCR\",\"sequence\":{\"hasSequence\":false},\"person\":{\"visualId\":\"C-2022-0003\"}},{\"_id\":\"b5930079-8d57-43a8-85c0-c95f7e0338d9\",\"updatedAt\":\"2022-09-12T20:13:54.378Z\",\"dateSampleTaken\":\"2022-08-16T00:00:00.000Z\",\"labName\":\"LNG_REFERENCE_DATA_CATEGORY_LAB_NAME_LAB_A\",\"sampleIdentifier\":\"LS0043-SCV2-PCR\",\"sequence\":{\"hasSequence\":false},\"person\":{\"visualId\":\"C-2022-0004\"}}]"
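For reference, JSON object keys are always double-quoted strings, so "$in" in the serialized query is well-formed JSON, and a single-quoted key would not parse. The following minimal Python sketch (not the R code above, and using only two of the sample dates) shows a standard serializer producing the same quoting and a strict parser rejecting single quotes:

```python
import json

sample_dates = ["2022-08-11T00:00:00.000Z", "2022-08-15T00:00:00.000Z"]
query = {"filter": {"where": {"dateSampleTaken": {"$in": sample_dates}}}}

# Keys are emitted with double quotes, exactly as jsonlite::toJSON() does.
query_json = json.dumps(query)
print(query_json)

# A single-quoted key is rejected by a strict JSON parser.
try:
    json.loads("{'$in': []}")
    single_quotes_ok = True
except json.JSONDecodeError:
    single_quotes_ok = False
print(single_quotes_ok)
```

This is consistent with the update above, where the same query worked in the LoopBack explorer. It suggests the quoting itself is unlikely to be the cause, although the sketch cannot show what is.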

Kusto extractjson not working with email address

I am attempting to use the extractjson() function on source data that includes email addresses (specifically the @ symbol).
let T = datatable(MyString:string)
[
    '{"user@domain.com": {"value":10}, "userdomain.com": { "value": 5}}'
];
T
| project extractjson('$.["user@domain.com"].value', MyString)
This returns null. Changing the JSONPath to '$.["userdomain.com"].value' does return the correct result.
I know the @ sign is used as the current node in a filter expression. Does it need to be escaped when used with KQL?
As a side note, I ran the same test using Node's 'jsonpath' package and it worked as expected.
const jp = require('jsonpath');
const data = {"user@domain.com": {"value":10}, "name2": { "value": 5}};
console.log(jp.query(data, '$["user@domain.com"].value'));
You can use the parse_json() function instead, if you don't have to use extract_json():
print MyString = '{"user@domain.com": {"value":10}, "userdomain.com": { "value": 5}}'
| project parse_json(MyString)["user@domain.com"].value
MyString_user@domain.com_value
10
From the documentation: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/extractjsonfunction
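The same idea, parsing the whole document and then indexing by key rather than evaluating a JSONPath expression, can be sketched outside Kusto. This Python snippet is only an analogy to the parse_json() approach and does not reproduce KQL behavior:

```python
import json

my_string = '{"user@domain.com": {"value":10}, "userdomain.com": { "value": 5}}'

# Parse once, then use plain key lookups; the "@" is just part of the key string.
doc = json.loads(my_string)
value = doc["user@domain.com"]["value"]
print(value)
```

Because the key is used as an ordinary string index, no path syntax is involved and the "@" needs no escaping.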

Application Insights and Azure Stream Analytics Query export the whole custom dimensions as string

I have set up a continuous export from Application Insights into Blob storage. With a data stream I'm able to get the JSON files into SQL DB. So far so good.
Also with help from Phani Rahul Sivalenka I'm able to query the individual properties of custom dimensions as described here: Application Insights and Azure Stream Analytics Query a custom JSON property
My custom dimensions looks like this when exporting manually into CSV file:
"{""OperatingSystemVersion"":""10.0.18362.418"",""OperatingSystem"":""WINDOWS"",""RuntimePlatform"":""UWP"",""Manufacturer"":""LENOVO"",""ScreenHeight"":""696"",""IsSimulator"":""False"",""ScreenWidth"":""1366"",""Language"":""it"",""IsTablet"":""False"",""Model"":""LENOVO_BI_IDEAPAD4Q_BU_idea_FM_""}"
Additionally to the single columns I like to have the whole custom dimensions as a string in a SQL Table column (varchar(max)).
In the "Test results" of my Data Stream Output Query I see the column formatted as above, but when actually exporting / writing into SQL DB, all my tests ended up with only the value "Array" or "Record" in my SQL table column.
What do I have to do in the Data Stream Query to get the whole custom dimensions value as a string and I'm able to write this into SQL Table as a whole string?
What do I have to do in the Data Stream Query to get the whole custom
dimensions value as a string and I'm able to write this into SQL Table
as a whole string?
You could use a UDF to merge all key-value pairs of a single row into one JSON-formatted string.
UDF:
function main(raw) {
    let str = "{";
    for (let key in raw) {
        str = str + "\"" + key + "\":\"" + raw[key] + "\",";
    }
    str += "}";
    return str;
}
SQL:
SELECT udf.jsonstring(INPUT1) FROM INPUT1
The answer put me on the right track.
The script above doesn't include the values as expected, so I modified it to work as needed:
function main(dimensions) {
    let str = "{";
    for (let i in dimensions)
    {
        let dim = dimensions[i];
        for (let key in dim)
        {
            str = str + "\"" + key + "\":\"" + dim[key] + "\",";
        }
    }
    str += "}";
    return str;
}
Selecting:
WITH pageViews as (
SELECT
V.ArrayValue.name as pageName
, *
, customDimensions = UDF.flattenCustomDimensions(A.context.custom.dimensions)
, customDimensionsString = UDF.createCustomDimesionsString(A.context.custom.dimensions)
FROM [AIInput] as A
CROSS APPLY GetElements(A.[view]) as V
)
With this I'm getting the custom dimensions string as follows in my SQL table:
{"Language":"tr","IsSimulator":"False","ScreenWidth":"1366","Manufacturer":"Hewlett-Packard","OperatingSystem":"WINDOWS","IsTablet":"False","Model":"K8K51ES#AB8","OperatingSystemVersion":"10.0.17763.805","ScreenHeight":"696","RuntimePlatform":"UWP",}
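Note that the string above ends with `,}`, so a strict JSON parser would reject it, and values containing quotes would not be escaped by plain concatenation. A minimal sketch of the same merge using a serializer instead of string concatenation follows. It is written in Python for illustration only; the function name `dimensions_to_json` is hypothetical, and the input shape (a list of single-key records) is an assumption based on the loop in the script above.

```python
import json

def dimensions_to_json(dimensions):
    """Merge a list of key/value records into one JSON object string."""
    merged = {}
    for dim in dimensions:
        for key, value in dim.items():
            # Keep every value as a string, as the concatenating script does.
            merged[key] = str(value)
    # The serializer handles commas and escaping, so no trailing comma is produced.
    return json.dumps(merged)

dimensions = [{"Language": "tr"}, {"IsSimulator": "False"}, {"Model": "K8K51ES#AB8"}]
dims_json = dimensions_to_json(dimensions)
print(dims_json)
```

In a JavaScript UDF the analogous built-in would be JSON.stringify on the merged object, though that variant is untested here.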

Loadrunner Parameters in JSON String

I'm trying to use a parameter inside of a JSON string, and would like to use an inner parameter to replace a GUID. I've changed the default parameter start and end characters since curly braces are used in JSON.
I've tried to do something like this, where the json param contains my json which is similar to this below.
{"DashboardGUID":"<Dash_GUID>"}
request_json = lr_eval_string("<json>");
lr_save_string(request_json, "request_json_param");
I'm expecting lr_eval_string to replace <Dash_GUID> with the GUID that's in this parameter. What's the best way of replacing this ID in my JSON string?
Not sure what you are asking but I will put this here in case someone comes here in the future:
main.c
Action()
{
    lr_eval_json("Buffer/File=my_json.json", "JsonObject=MJO", LAST);
    lr_json_stringify("JsonObject=MJO", "Format=compact", "OutputParam=newJsonBody", LAST);
    lr_save_string(lr_eval_string(lr_eval_string("{newJsonBody}")), "tmp");
    web_reg_find("Text={mydate}", LAST);
    web_rest("POST",
        "URL=http://myServer.microfocus.com/url",
        "Method=POST",
        "EncType=raw",
        "Body={tmp}",
        HEADERS,
        "Name=Content-Type", "Value=application/json", ENDHEADER,
        LAST);
    return 0;
}
my_json.json
{
    "LastActionId": 0,
    "Updated": "{mydate}"
}
Okay, so instead of doing what I was thinking above, I ended up creating an array of chars with this {"DashboardGUID":"<Dash_GUID>", someotherdata:"123"} in 10 different positions within the array. I then randomly selected an element from this array, and when running lr_eval_string on it the parameter was replaced.
Hopefully this makes sense to those looking to do something similar.
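The underlying idea, using angle-bracket delimiters so that JSON's curly braces stay literal while the placeholder is substituted, can be sketched generically. This Python snippet only illustrates the substitution and does not use or reproduce any LoadRunner API; `fill_template` is a hypothetical helper.

```python
import json
import uuid

def fill_template(template, params):
    """Replace each <name> placeholder in the template with its parameter value."""
    result = template
    for name, value in params.items():
        result = result.replace("<" + name + ">", value)
    return result

template = '{"DashboardGUID":"<Dash_GUID>"}'
guid = str(uuid.uuid4())
request_json = fill_template(template, {"Dash_GUID": guid})
print(request_json)
```

Because only text between `<` and `>` is treated as a parameter, the braces and quotes of the JSON body pass through unchanged.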

Groovy Map Issue with Variable Properties and String Interpolation

I have been navigating map structures fine for a long time now. Yet, for some reason, the root of this problem escapes me. I've tried bracket notation as well, no luck.
Why does the final output return null instead of "[serverinfo:[listenPort:19001]]"?
If I replace the two instances of ' "$instanceName" ' with simply ' services ', it works.
String instanceName = "Services"
Map serverNode = [
    instances:[
        "$instanceName":[
            serverinfo:[
                listenPort:19001
            ]
        ]
    ]
]
println "$instanceName"
println serverNode.instances
println serverNode.instances."$instanceName"
//output
Services
[Services:[serverinfo:[listenPort:19001]]]
null
The type of "$instanceName" is GStringImpl, not String. It's a common mistake (and hard to find!)
def serverNode = [
    instances:[
        ("$instanceName" as String):[
            serverinfo:[
                listenPort:19001
            ]
        ]
    ]
]
As stated by @tim_yates in a comment, if your interpolated string is as simple as in this example (i.e. "${property}"), then you can use the (property) syntax: Groovy puts the value of the property as the key, not the word "property".

Resources