Extracting values with jq only when they exist

I have a large file of records that contain fields that look something like this:
{
  "id": "1000001",
  "updatedDate": "2018-12-21T01:52:00Z",
  "createdDate": "1993-11-30T02:59:25Z",
  "varFields": [
    {
      "fieldTag": "b",
      "content": "1000679727"
    },
    {
      "fieldTag": "v",
      "content": "v.1"
    }
  ]
}
I need to extract the .content element along with other things, but only when the fieldTag associated with it is "v". Only some records contain a fieldTag "v".
When I try to parse using
(.varFields[] |select(.fieldTag=="v") | "\(.content)") // ""
it works fine so long as v is present. However, when it is not present, I get
jq: error (at <stdin>:353953): Cannot iterate over null (null)
I tried to get rid of the error with multiple variations, including things to the effect of
(select((.varFields[] |select(.fieldTag=="v") | .content) != null) | .varFields[] |select(.fieldTag=="v") | "\(.content)") // ""
but I'm still getting the same error. What am I missing?

Take a look at the error suppression operator ?, which works a bit like the ?. optional chaining operator in JavaScript.
The ? operator, used as EXP?, is shorthand for try EXP.
Example:
jq '[.[] | (.a)?]'
Input: [{}, true, {"a":1}]
Output: [null, 1]
There is a slightly simpler runnable example of this at https://jqplay.org/jq?q=%5B.%5B%5D%7C(.a)%3F%5D&j=%5B%7B%7D%2C%20true%2C%20%7B%22a%22%3A1%7D%5D, and the try-catch operator is similar if all you need is custom error handling (or just error ignoring...).
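Applied to the records in the question, the fix is to guard the iteration itself. A minimal sketch (the []? form suppresses the "Cannot iterate over null" error for records where varFields is missing or null, so the // "" fallback kicks in):
(.varFields[]? | select(.fieldTag=="v") | .content) // ""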

Related

Kusto extractjson not working with email address

I am attempting to use the extractjson() method on source data that includes email addresses (specifically the @ symbol).
let T = datatable(MyString:string)
[
'{"user#domain.com": {"value":10}, "userdomain.com": { "value": 5}}'
];
T
| project extractjson('$.["user@domain.com"].value', MyString)
This results in null being returned; changing the JSONPath to '$.["userdomain.com"].value' does return the correct result.
I know the @ sign is used as the current node in a filter expression; does it need to be escaped when used with KQL?
Just as a side note, I ran the same test using Node's 'jsonpath' package and it worked as expected.
const jp = require('jsonpath');
const data = {"user@domain.com": {"value":10}, "name2": { "value": 5}};
console.log(jp.query(data, '$["user@domain.com"].value'));
You can use the parse_json() function instead when you don't have to use extractjson():
print MyString = '{"user@domain.com": {"value":10}, "userdomain.com": { "value": 5}}'
| project parse_json(MyString)["user@domain.com"].value
MyString_user@domain.com_value
10
From the documentation: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/extractjsonfunction
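For completeness, here is the same parse_json() fix dropped into the datatable from the question — just the two snippets above combined, not a separately verified pattern:
let T = datatable(MyString:string)
[
    '{"user@domain.com": {"value":10}, "userdomain.com": { "value": 5}}'
];
T
| project parse_json(MyString)["user@domain.com"].value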

Does Boto3 DynamoDB have reserved attribute names for update_item with condition expressions? Unexpected attribute SET behavior

I've implemented a simple object versioning scheme that allows the calling code to supply a current_version integer that will set the ConditionExpression. I've also implemented a simple timestamping scheme that sets an attribute named auto_timestamp to the current Unix timestamp.
When the ConditionExpression is supplied with the object's current version integer, the update occurs, but also sets auto_timestamp to the current version value, rather than the value supplied in ExpressionAttributeValues. This only occurs if the attribute names are #a0, #a1 ... and values are :v0, :v1 ...
For example, this runs as expected without the condition, and auto_timestamp is set to 1643476414 in the table. The if_not_exists is used to start the object version at 0 if the item does not yet exist or did not previously have an auto_object_version attribute.
update_kwargs = {
    "Key": {"user_id": user_id},
    "UpdateExpression": 'SET #a0 = :v0, #a1 = if_not_exists(#a1, :zero) + :v1',
    "ExpressionAttributeNames": {"#a0": "auto_timestamp", "#a1": "auto_object_version"},
    "ExpressionAttributeValues": {":v0": 1643476414, ":v1": 1, ":zero": 0}
}
table.update_item(**update_kwargs)
However, this example runs without exception, but auto_timestamp is set to 1. The behavior repeats for each subsequent increment of current_version on additional calls to update_item:
from boto3.dynamodb.conditions import Attr
update_kwargs = {
    "Key": {"user_id": user_id},
    "UpdateExpression": 'SET #a0 = :v0, #a1 = if_not_exists(#a1, :zero) + :v1',
    "ExpressionAttributeNames": {"#a0": "auto_timestamp", "#a1": "auto_object_version"},
    "ExpressionAttributeValues": {":v0": 1643476414, ":v1": 1, ":zero": 0}
    "ConditionExpression": Attr("auto_object_version").eq(1)
}
table.update_item(**update_kwargs)
While debugging, I changed the scheme by which I am labeling the attribute names and values to use #att instead of #a and :val instead of :v and the following works as desired and auto_timestamp is set to 1643476414:
from boto3.dynamodb.conditions import Attr
update_kwargs = {
    "Key": {"user_id": user_id},
    "UpdateExpression": 'SET #att0 = :val0, #att1 = if_not_exists(#att1, :zero) + :val1',
    "ExpressionAttributeNames": {"#att0": "auto_timestamp", "#att1": "auto_object_version"},
    "ExpressionAttributeValues": {":val0": 1643476414, ":val1": 1, ":zero": 0}
    "ConditionExpression": Attr("auto_object_version").eq(1)
}
table.update_item(**update_kwargs)
I couldn't find any documentation on reserved attribute names or values that shouldn't be used for keys in ExpressionAttributeNames or ExpressionAttributeValues.
Has anyone witnessed this behavior before? It is easily worked around by switching the string formatting used to generate the keys, but it was very unexpected.
There are no reserved attribute or value names; I routinely use names like :v1 and #a1 in my own tests, and they seem to work fine.
Assuming you correctly copied-pasted your code into the question, it seems to me you simply have a syntax error in your code - you are missing a comma after the "ExpressionAttributeValues" entry. What I don't understand, though, is how this code runs at all, or why changing a to att changed anything. Please be more careful to paste a self-contained code snippet that works (or fails) exactly as described.
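If you want to rule out any interaction between boto3's condition builder and your hand-numbered placeholders, you can also pass the condition as a plain string with its own explicitly named placeholders. A minimal sketch, assuming a hypothetical table name and user_id value in place of those from the question:
import boto3

table = boto3.resource("dynamodb").Table("my_table")  # hypothetical table name
user_id = "u-123"                                     # hypothetical key value

update_kwargs = {
    "Key": {"user_id": user_id},
    "UpdateExpression": "SET #att0 = :val0, #att1 = if_not_exists(#att1, :zero) + :val1",
    # The condition is a plain string, so every placeholder below is spelled out by us
    # rather than generated from an Attr(...) object.
    "ConditionExpression": "#att1 = :cur",
    "ExpressionAttributeNames": {"#att0": "auto_timestamp", "#att1": "auto_object_version"},
    "ExpressionAttributeValues": {":val0": 1643476414, ":val1": 1, ":zero": 0, ":cur": 1},
}
table.update_item(**update_kwargs)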

Extract values from web service JSON response with JSONPath

I have a JSON response from a web service that looks something like this:
[
  {
    "id": 4,
    "sourceID": null,
    "subject": "SomeSubjectOne",
    "category": "SomeCategoryTwo",
    "impact": null,
    "status": "completed"
  },
  {
    "id": 12,
    "sourceID": null,
    "subject": "SomeSubjectTwo",
    "category": "SomeCategoryTwo",
    "impact": null,
    "status": "assigned"
  }
]
What I need to do is extract the subjects from all of the entities by using a JSONPath query.
How can I get these results:
Subject from the first item - SomeSubjectOne
Filter on specific subject value from all entities (SomeSubjectTwo for example)
Get Subjects from all entities
Goessner's original JSONPath article (https://goessner.net/articles/JsonPath/) is a good reference point, and all implementations more or less stick to the suggested query syntax. However, implementations like Jayway JsonPath/Java, JSONPath-Plus/JavaScript, and flow-jsonpath/PHP may behave a little differently in some areas. That's why it can be important to know which implementation you are actually using.
Subject from the first item
Just use an index to select the desired array element.
$.[0].subject
Returns:
SomeSubjectOne
Specific subject value
First, go for any elements with .., check those that have a subject with [?(@.subject)] and use == '...' for comparison.
$..[?(@.subject == 'SomeSubjectTwo')]
Returns
[
  {
    "id": 12,
    "sourceID": null,
    "subject": "SomeSubjectTwo",
    "category": "SomeCategoryTwo",
    "impact": null,
    "status": "assigned"
  }
]
Get all subjects
$.[*].subject
or simply
$..subject
Returns
[ "SomeSubjectOne", "SomeSubjectTwo" ]

JmesPath join or concatenate nested array elements

I realize there are several other JMESPath join questions here, but I'm having trouble with a separate problem that I haven't found any examples for: I need to concatenate (i.e., join) a set of JSON values that have dynamically-named keys into a single element.
If I start with the following JSON data structure:
{
  "data": [
    {
      "secmeetingdays": {
        "dayset_01": {
          "day_01": "M",
          "day_02": "W",
          "day_03": "F"
        },
        "dayset_02": {
          "day_01": "T",
          "day_02": "TH"
        }
      }
    }
  ]
}
I would like to end up with something like this:
[
[
"M,W,F"
],
[
"T,TH"
]
]
I've started the query to flatten the data down, but am completely stuck with the join syntax. Nothing I try seems to be working.
Attempt 1: data[].secmeetingdays | [0].*.*
[
[
"M",
"W",
"F"
],
[
"T",
"TH"
]
]
Almost, but not quite there.
Attempt 2: data[].secmeetingdays | [0].*.* | {join(',',@)}
fails
Attempt 3: data[].secmeetingdays | [0].*.*.join(',',@)
fails
Attempt 4: data[].secmeetingdays | {join(',',@[0].*.*)}
fails
I tried avoiding two flattens to have some reference to grab onto inside the join.
Attempt 5: data[].secmeetingdays | [0].* | join(',',@[])
fails
Attempt 6: data[].secmeetingdays | [0].*.* | @.join(',',[]) gives a result, but it's not what I want:
"M,W,F,T,TH"
Update:
Attempt 7: data[].secmeetingdays[].*.* | [].join(',',@) gets me a lot closer but is also not exactly what I need:
[
"M,W,F",
"T,TH"
]
I might be able to work with this solution, but will leave this open in case someone has the accurate answer to the question.
The example here https://jmespath.org/ has a join, but it is only on a single list of items. How can I join the sub-arrays without affecting the structure of the parents?
data[*].secmeetingdays.values(@)[].values(@).join(',', @).to_array(@)
Gives you the example's desired output, but I see no benefit to wrapping each single string in an extra array.
data[].secmeetingdays.values(@) | [*][*].values(@).join(',', @)
Produces more logical output (to me) because it gives an array of daysets for each item in the data array:
[
[
"M,W,F",
"T,TH"
]
]
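As a quick sanity check, the second expression can be run with the jmespath Python package (an assumption on my part; any conforming implementation should behave the same, provided the parser preserves the object key order it reads):
import jmespath

doc = {
    "data": [
        {
            "secmeetingdays": {
                "dayset_01": {"day_01": "M", "day_02": "W", "day_03": "F"},
                "dayset_02": {"day_01": "T", "day_02": "TH"},
            }
        }
    ]
}

expr = "data[].secmeetingdays.values(@) | [*][*].values(@).join(',', @)"
print(jmespath.search(expr, doc))  # [['M,W,F', 'T,TH']]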
Note that the proper way to deal with such data is to write a script that iterates the objects, parses the keys, and guarantees ordered output after sorting the items (a sketch of such a script follows the example below). JSON parsers have no obligation to keep object properties ordered the same as they were stored/read, so blindly converting to an array as above is not certain to produce the order you desire. Using key names to store order is superfluous; chronologically ordered data should be stored as arrays, like so:
{
"data": [
{
"secmeetingdays": [
[
"M",
"W",
"F"
],
[
"T",
"TH"
]
]
}
]
}
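Along those lines, here is a minimal Python sketch of the script-based approach recommended above, run against the original key-based structure; the dayset_NN and day_NN key names are sorted so the output order comes from the keys rather than from parser behavior:
import json

raw = '''{"data": [{"secmeetingdays": {
    "dayset_01": {"day_01": "M", "day_02": "W", "day_03": "F"},
    "dayset_02": {"day_01": "T", "day_02": "TH"}}}]}'''

doc = json.loads(raw)
result = []
for item in doc["data"]:
    daysets = item["secmeetingdays"]
    joined = []
    for ds_key in sorted(daysets):  # dayset_01, dayset_02, ...
        days = daysets[ds_key]
        # Sort the day_NN keys, then join the day codes for this dayset.
        joined.append(",".join(days[k] for k in sorted(days)))
    result.append(joined)
print(result)  # [['M,W,F', 'T,TH']]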
The same join / to_array behavior on a simpler two-element example:
[[0].title,[1].title].join(',', @).to_array(@)
RESULT: ["some1,some2"]
[[0].title,[1].title].join(',', @)
RESULT: "some1,some2"
[[0].title,[1].title]
RESULT: ["some1", "some2"]

Display empty line for non existing fields with jq

I have the following json data:
{"jsonrpc":"2.0","result":[],"id":1}
{"jsonrpc":"2.0","result":[{"hostmacroid":"2392","hostid":"10953","macro":"{$GATEWAY}","value":"10.25.230.1"}],"id":1}
{"jsonrpc":"2.0","result":[{"hostmacroid":"1893","hostid":"12093","macro":"{$GATEWAY}","value":"10.38.118.1"}],"id":1}
{"jsonrpc":"2.0","result":[{"hostmacroid":"2400","hostid":"14471","macro":"{$GATEWAY}","value":"10.25.230.1"}],"id":1}
{"jsonrpc":"2.0","result":[{"hostmacroid":"799","hostid":"10798","macro":"{$GATEWAY}","value":"10.36.136.1"}],"id":1}
{"jsonrpc":"2.0","result":[],"id":1}
{"jsonrpc":"2.0","result":[{"hostmacroid":"1433","hostid":"10857","macro":"{$GATEWAY}","value":"10.38.24.129"}],"id":1}
{"jsonrpc":"2.0","result":[{"hostmacroid":"842","hostid":"13159","macro":"{$GATEWAY}","value":"10.38.113.1"}],"id":1}
{"jsonrpc":"2.0","result":[],"id":1}
I am trying to extract the value of the "value" field from each line. jq -r '.result[].value' <jsonfile> works perfectly but it does not take into account the JSON lines where there is no "value" field. I would like it to print an empty line for them. Is this possible with jq?
You can use this:
jq -r '.result[].value // "" ' a.json
This uses the alternative operator //: if .result[].value produces a value, that value gets printed; otherwise an empty line gets printed.
This would work:
jq -r '.result | if length > 0 then .[0].value else "" end'
Since false // X and null // X produce X, .result[].value // "" may not be what you want in all cases.
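For example, given the input {"result":[{"value":false}]}, the filter .result[].value // "" prints an empty line even though a "value" field is present, because false // "" yields "".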
To achieve the stated goal as I understand it, you could use the following filter:
.result[] | if has("value") then .value else "" end
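For the sample input at the top of this question, the filter from the first answer produces one line per input record (output reconstructed by hand, with data.jsonl as a stand-in filename; note the output begins and ends with a blank line, matching the records whose result arrays are empty):
$ jq -r '.result[].value // ""' data.jsonl

10.25.230.1
10.38.118.1
10.25.230.1
10.36.136.1

10.38.24.129
10.38.113.1
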
