Parse JSON inside JSON attribute using JSONPath

I have a JSON list where one attribute of each element is itself a JSON document stored as a string. It comes from poor upfront design, but here it is.
I want to query the distinct attributes inside the JSON string contained in the elements.
Here is an example with just one item. I hand-wrote it, but the production data is machine-generated, so you can trust that it is valid JSON.
[{
"extraData": "{\"foo\":\"bar\"}"
}]
I would like to query for something like $.*.extraData.foo, but obviously such syntax does not work.
I am using IntelliJ IDEA Jsonpath evaluator.
The syntax should be something like parse($.*.extraData).*.foo
This article suggests that no operator is available to parse JSON embedded inside JSON.
It has to be JSONPath only, because this is for data analysis. In Java, I use Jackson to parse the extraData object as a JsonNode, but my goal here is to explore the large data set and perhaps obtain some distinct values I want to use for enumeration purposes.

To JSONPath, the embedded JSON is just a string. The best you could do with a single operation is use a regular expression in the expression selector, but getting a regex to identify JSON is really tricky, and that's if your implementation even supports regex.
I think the best option would be to use two paths:
The first path gets the embedded JSON: $.*.extraData
Parse that value.
The second path gets the data you need: $.foo
This requires some extra code, but I think it's your only realistic option.
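To make the three steps concrete, here is a minimal sketch in plain JavaScript. It does not use a JSONPath library: the array map stands in for $.*.extraData and the property lookup stands in for $.foo. The function names and the second and third sample items are made up for illustration, and it assumes every extraData value is valid JSON, as the question states.

```javascript
// Sample input shaped like the question's data; items 2 and 3 are hypothetical.
const input = [
  { extraData: "{\"foo\":\"bar\"}" },
  { extraData: "{\"foo\":\"baz\"}" },
  { extraData: "{\"foo\":\"bar\"}" }
];

// Step 1: plain-JS equivalent of $.*.extraData (collect every embedded string).
function extractEmbedded(items) {
  return items.map(function (item) { return item.extraData; });
}

// Steps 2 and 3: parse each string, then read the property that $.foo would select.
function distinctValues(items, key) {
  const seen = new Set();
  extractEmbedded(items).forEach(function (text) {
    const parsed = JSON.parse(text); // throws if the embedded string is not valid JSON
    if (parsed !== null && typeof parsed === "object" && key in parsed) {
      seen.add(parsed[key]);
    }
  });
  return Array.from(seen);
}

const distinctFoo = distinctValues(input, "foo");
console.log(distinctFoo);
```

If your tool has a real JSONPath evaluator available, the same shape applies: evaluate the first path, parse each result, then evaluate the second path against each parsed value.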

Related

Kusto's `parse_json` doesn't work on custom dimensions

I'm hoping to be able to analyze structured data stored in a custom dimension of a custom telemetry event emitted to application insights, and getting some weird behavior. It seems like the JSON can't be parsed normally, but if I pass it through strcat it is able to parse the json just fine.
customEvents
| where name == "PbConfigFilterComponentSaved"
| take 1
| project
    jsonType=gettype(customDimensions.Json),
    parsedType=gettype(parse_json(customDimensions.Json)),
    strcatType=gettype(strcat('', customDimensions.Json)),
    strcatParsedType=gettype(parse_json(strcat('', customDimensions.Json)))
Result:
jsonType: string
parsedType: string
strcatType: string
strcatParsedType: dictionary
Is there a better approach to getting parse_json to work on this kind of value?
Update
In case it's in any way relevant, here's the value of customDimensions.Json:
{"filterComponentKey":"CatalystAgeRange","typeKey":"TemporalConstraint","uiConfig":{"name":"Age","displayMode":"Age"},"config":{"dateSelector":"pat.BirthDTS"},"disabledForScenes":false,"disabledForFilters":false}
Could you please demonstrate a sample record that isn't parsed correctly?
Speculating (before seeing the data): Have you verified the final paragraph here doesn't apply to your case?
It is somewhat common to have a JSON string describing a property bag in which one of the "slots" is another JSON string. […] In such cases, it is not only necessary to invoke parse_json twice, but also to make sure that in the second call, tostring will be used. Otherwise, the second call to parse_json will simply pass-on the input to the output as-is, because its declared type is dynamic.
The type of customDimensions is dynamic and so accessing a property like customDimensions.json from it will return a string typed as dynamic.
You have to explicitly cast it as string and then parse it:
todynamic(tostring(customDimensions.json)).property
I think the "Notes" section in the documentation is exactly the issue, as mentioned by Yoni L. in the previous answer.

Get the results of a CosmosDb query as a Raw string (payload of the http response)

I'm using the .NET API of CosmosDB and I'm having a hard time figuring out how to get the raw result of a CosmosDB query before it gets deserialized into a class. Browsing the documentation, all the examples I find cast the results to a specific class or to a dynamic. That is:
// This returns a Document, which is actually a dynamic...
client.ReadDocumentAsync(...)
// This returns an object of type MyClass, which I suppose is cast internally by the API
client.ReadDocumentAsync<MyClass>(...)
What I want to do is to get the original raw JSON payload of the result to inspect it without the overhead of deserializing it to anything else.
Does anybody know if it's possible to get the raw result with the .NET api? If so, how?
In other cases, I need to use the result as an ExpandoObject to treat it dynamically, but I find that the "dynamic" results given by the API are not "expandable", so I'm forced to serialize them and then deserialize them again recursively into an ExpandoObject. Furthermore, the result is polluted with _rid, Etag, etc. properties that I don't need on my object. It's quite annoying.
I think serializing and then deserializing again is unnecessary overhead, so maybe the optimal way would be to get the raw JSON result and write a method that deserializes it directly to an ExpandoObject.
Or maybe I'm missing the point and there's an API to get the results as ExpandoObjects. Does anybody know of one?
Check out this question that I had earlier:
Converting arbitrary json response to list of "things"
Although I didn't name it, the API in question was actually DocumentDb, so I think you'll be able to use that code.
I've seen some bad advice here, but this is built into the SDK natively.
Document doc = await cosmosClient.ReadDocumentAsync(yourDocLink);
string json = doc.ToString();

extracting array of objects from xml response to js

I'm extracting variables from my XML response and trying to reformat the objects, and I want to do that in the most efficient way possible. I want to load the XML array of identical objects into a JS array that I can loop through to output the new format. I found a reference to type="nodeset" for XPath extraction, but I could not find it in the documentation.
What is the best way to load the full XML objects into a JS variable, loop through the objects, and output the new format?
Thanks for any help you can give me on this.
Best way to accomplish this is with the XMLToJSON policy, a JavaScript callout in which you can mediate your payload, and then transform back with JSONToXML if you need it.
For an XML array that doesn't need filtering, you can use XPATH with type="nodeset", just as you described. This allows you to pull a node and all child nodes in a particular XPATH. As I'm sure you noticed, you can't do this by just extracting as type="string". Just know that you will need to convert the extracted variable to string before you can use the XML nodes like you do every other string. You can then do JSON.parse to take the string and manipulate the object like an array. The string conversion is as simple as calling a JS callout with the following code (if someone else has a better way, I'm all ears):
var extractedNodeSet = context.getVariable("extractedNodeSet");
var extractedNodeSetString = String(extractedNodeSet);
For an XML array that needs filtering/manipulation, I recommend using XSLT along with the trusted <xsl:for-each select=...> element. This will let you set conditions on the XML array nodes, manipulate tags/data, and extract the data, all in one step. The only concern is that this isn't a JS array, so if you absolutely must have a JS array, you'll then need to do an XMLToJSON and work with the data from there.
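As a rough sketch of the XMLToJSON route described above, the callout logic could look something like this once the payload is JSON. The element names (items, item, id, name), the output field names, and the helper functions are all hypothetical. The sketch also assumes that a repeated XML element arrives as an array while a single element may arrive as a plain object, so it normalizes both cases.

```javascript
// Hypothetical XMLToJSON output; the element names are invented for this example.
var sampleJson = '{"items":{"item":[{"id":"1","name":"first"},{"id":"2","name":"second"}]}}';

// Normalize a value that may be missing, a single object, or an array.
function toArray(value) {
  if (value === undefined || value === null) { return []; }
  return Array.isArray(value) ? value : [value];
}

// Parse the converted payload and map each object to the new format.
function reformatItems(jsonText) {
  var payload = JSON.parse(jsonText);
  var items = toArray(payload.items && payload.items.item);
  return items.map(function (item) {
    return { key: item.id, label: item.name };
  });
}

var reformatted = reformatItems(sampleJson);
console.log(JSON.stringify(reformatted));

// Inside an Apigee JavaScript callout, the input would come from a flow variable
// (for example context.getVariable("response.content")) rather than sampleJson,
// and the result would be written back with context.setVariable(...).
```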

Is it possible to convert a result set from WebMatrix.Database.Query() to an XML string?

I'd like to consume the result set on the client side using a jQuery script, however I'm uncertain how to convert it to XML, other than building the XML DOM string manually, which I'd prefer to avoid. Is there an automatic way to convert it or obtain it as an XML string?
There are a number of APIs for generating XML, including LINQ to XML. However, I would serialize your server-side data to JSON if you want to make it available to jQuery. The Json helper is perfect for this.

CouchDB: accessing nested structures in map function

I have a document based on an XML structure that I have stored in a CouchDB database.
Some of the keys contain namespaces and are of the form "namespace:key":
{"mykey": {"nested:key": "nested value"}}
In the map function, I want to emit the nested value as a key, but the colon inside the name makes it hard...
emit(doc.mykey.nested:key, doc) <-- will not work.
Does anyone know how this can be solved?
A hint that it's all just JSON and JavaScript gave me some new ideas for searching.
It may be that a colon in JSON keys isn't valid, but I found a way. By treating the doc object as a hash, I can access my value in the following manner:
doc.mykey['nested:key']
It works - for now...
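For anyone who wants to try the bracket-notation approach outside CouchDB, here is a small sketch. The emit function below is only a local stand-in for the one CouchDB provides, and the map function is given a name so it can be called directly (in a design document it would be an anonymous function (doc) { ... }). The guard against documents without mykey is my own addition.

```javascript
// Local stand-in for CouchDB's emit(), so the map function can run outside the server.
var emitted = [];
function emit(key, value) {
  emitted.push({ key: key, value: value });
}

// Map function: bracket notation reaches the key that contains a colon.
function map(doc) {
  if (doc.mykey && doc.mykey["nested:key"] !== undefined) {
    emit(doc.mykey["nested:key"], doc);
  }
}

map({ _id: "a", mykey: { "nested:key": "nested value" } });
map({ _id: "b" }); // no mykey, so this document is skipped instead of throwing

console.log(emitted.length, emitted[0].key);
```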
That's because Couch is a JSON-based document DB whose map functions are JavaScript, and doc.mykey.nested:key is not valid JavaScript. Dot notation only accepts names that are valid JavaScript identifiers, and : is not a valid identifier character.
So, the simple answer is: "No, this won't and can't work" as written. You need to change your identifiers or use bracket notation instead.
Actually, I should qualify that.
Couch can use pretty much ANYTHING for its views et al, and, in theory, works with any payload. But out of the box, it's just JavaScript and JSON.
