I have a JSON file that I want to process with jq. It has an array of objects inside another object, with a key that I want to use to populate a new array.
In my real use case this is nested among a lot of other fluff and there are many more arrays, but take this as a simpler, representative example:
{
"numbers": [
{
"numeral": 1,
"ordinal": "1st",
"word": "One"
},
{
"numeral": 2,
"ordinal": "2nd",
"word": "Two"
},
{
"numeral": 5,
"ordinal": "5th",
"word": "Five"
},
{
"some-other-fluff-i-want-to-ignore": true
}
]
}
I'd like to use JQ to get a new array based on the elements, ignoring some elements and handling the missing ones. e.g.
[
"The 1st word is One",
"The 2nd word is Two",
"Wot no number 3?",
"Wot no number 4?",
"The 5th word is Five"
]
Doing this in a loop for the elements that are there is simple, terse and elegant enough:
.numbers | map( . | select( .numeral) | [ "The", .ordinal, "word is", .word ] | join (" "))
But I can't find a way to cope with the missing entries. I have some code that sort-of works:
.numbers | [
( .[] | select(.numeral == 1) | ( [ "The", .ordinal, "word is", .word ] | join (" ")) ) // "Wot no number 1?",
( .[] | select(.numeral == 2) | ( [ "The", .ordinal, "word is", .word ] | join (" ")) ) // "Wot no number 2?",
( .[] | select(.numeral == 3) | ( [ "The", .ordinal, "word is", .word ] | join (" ")) ) // "Wot no number 3?",
( .[] | select(.numeral == 4) | ( [ "The", .ordinal, "word is", .word ] | join (" ")) ) // "Wot no number 4?",
( .[] | select(.numeral == 5) | ( [ "The", .ordinal, "word is", .word ] | join (" ")) ) // "Wot no number 5?"
]
It produces usable output, after a fashion:
richard#sophia:~$ jq -f make-array.jq < numbers.json
[
"The 1st word is One",
"The 2nd word is Two",
"Wot no number 3?",
"Wot no number 4?",
"The 5th word is Five"
]
richard#sophia:~$
However, whilst it produces the output, handles the missing elements and ignores the bits I don't want, it's obviously extremely naff code that cries out for a for-loop or something similar but I can't see a way in JQ to do this. Any ideas?
jq solution:
jq 'def print(o): "The \(o.ordinal) word is \(o.word)";
.numbers | (reduce map(select(.numeral))[] as $o ({}; .["\($o.numeral)"] = $o)) as $o
| [range(0; ($o | [keys[] | tonumber] | max))
| "\(.+1)" as $i
| if ($o[$i]) then print($o[$i]) else "Wot no number \($i)?" end
]' input.json
The output:
[
"The 1st word is One",
"The 2nd word is Two",
"Wot no number 3?",
"Wot no number 4?",
"The 5th word is Five"
]
Another solution !
jq '[
range(1; ( .numbers | max_by(.numeral)|.numeral ) +1 ) as $range_do_diplay |
.numbers as $thedata | $range_do_diplay |
. as $i |
if ([$thedata[]|contains( { numeral: $i })]|any )
then
($thedata|map(select( .numeral == $i )))|.[0]| "The \(.ordinal) word is \(.word)"
else
"Wot no number \($i)?"
end
] ' numbers.json
This solution uses:
max_by to find the maximum value of numeral
range to generate the list of values
variables to store intermediate values
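Both answers follow the same pattern: loop over a range of expected numerals, look up the matching element, and fall back to a placeholder when there is none. A shorter sketch of that idea, assuming the sample input from the question (the shell variable names `input` and `result` are only illustrative), combines first with the alternative operator //:

```shell
# Sample data from the question, inlined so the example is self-contained.
input='{"numbers":[{"numeral":1,"ordinal":"1st","word":"One"},{"numeral":2,"ordinal":"2nd","word":"Two"},{"numeral":5,"ordinal":"5th","word":"Five"},{"some-other-fluff-i-want-to-ignore":true}]}'

result=$(printf '%s' "$input" | jq -c '
  .numbers as $n
  # Highest numeral present; elements without a numeral are skipped.
  | ([$n[] | .numeral | select(. != null)] | max) as $max
  | [ range(1; $max + 1) as $i
      # first(...) yields nothing when no element matches,
      # so the alternative after // supplies the placeholder.
      | first($n[] | select(.numeral == $i) | "The \(.ordinal) word is \(.word)")
        // "Wot no number \($i)?"
    ]')

echo "$result"
```

With the sample input this should print the same five strings as the answers above, as one compact array.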
I would like to ask for help. I need to split the value of the key "Text" on the space character " " and join the parts into one line. My current code works with exact character positions, so when Text contains a longer field such as W10, only W1 is shown.
My input
[
{
"PartNumber": "5SE32DFVLG002",
"ClassificationNo": "500001",
"StringValue": "R0050SWSW",
"Field": "95001",
"Text": "S1 W1 cr.sec+colour"
},
{
"PartNumber": "5SE32DFVLG002",
"ClassificationNo": "500001",
"StringValue": "R0050SWSW",
"Field": "95004",
"Text": "S1 W10 cr.sec+colour"
}
]
My current filter in jq play
[.Oslm[] | select(.ClassificationNo=="500001" and .StringValue!="") |
{PartNumber,ClassificationNo,StringValue,Field,Text}] |
sort_by(.Field) | .[] | [.PartNumber,.ClassificationNo,
.Field[3:5],.Text[0:2] + "-" + .Text[3:5] + .StringValue[0:1], "Test",
.StringValue[1:10]] | join(";")
Actual result
5SE32DFVLG002;500001;95001;S1-W1R;TEST;0050SWSW
5SE32DFVLG002;500001;95004;S1-W1R;TEST;0050SWSW
I would like to have this result
5SE32DFVLG002;500001;95001;S1-W1R;TEST;0050SWSW
5SE32DFVLG002;500001;95004;S1-W10R;TEST;0050SWSW
Simplify the part that builds the value from .Text by using jq's split() to split on a single space. That way you do not rely on the lengths of the sub-fields you want to extract:
( .Text | split(" ") | .[0] + "-" + .[1] ) + .StringValue[0:1]
i.e. with full code
map( select( .ClassificationNo == "500001" and .StringValue != "" ) ) |
sort_by(.Field) |
.[] |
[
.PartNumber,
.ClassificationNo,
.Field[3:5],
( .Text | split(" ") | .[0] + "-" + .[1] ) + .StringValue[0:1],
"Test",
.StringValue[1:10]
] |
join(";")
demo at jqplay
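To see why the fixed slice truncates and the split does not, here is a small sketch that applies both to the second sample Text value. The shell variable name `result` is only illustrative.

```shell
# Compare the fixed-position slice with the split-based extraction.
result=$(jq -nc '
  "S1 W10 cr.sec+colour"
  | [ .[3:5],                  # characters 3 and 4 only
      (split(" ") | .[1]) ]    # the whole second space-separated field
')

echo "$result"   # ["W1","W10"]
```

The slice stops after two characters regardless of how long the field is, while split(" ") returns each field in full.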
My JSON is an array of one object like this:
[{
"id": 125650,
"status": "success",
"name": "build_job",
"artifacts": [
{
"file_type": "archive",
"size": 72720116,
"filename": "artifacts.zip",
"file_format": "zip"
},
{
"file_type": "metadata",
"size": 1406,
"filename": "metadata.gz",
"file_format": "gzip"
}
]
}]
I want to select only the object ID if the following conditions matches:
status == success
name == build_job
artifacts.size > 0 where file_type == archive
I'm stuck on the last condition, I can select artifacts with size > 0, OR artifacts where file_type = archive, but not both at the same time.
Here's my current query :
| jq '.[0] | select(.name == "build_job" and .status == "success" and .artifacts[].file_type == "archive") | .id'
Can you help me with that ?
For the last condition, you presumably mean something like:
all(.artifacts[];
if .file_type == "archive" then .size > 0 else true end)
which can also be written as:
all(.artifacts[] | select(.file_type == "archive");
.size > 0)
I'd recommend using either all or any, depending on your requirements.
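As a sketch of the any variant, assuming the requirement is "at least one archive artifact with a size greater than zero", the whole query could look like this. The sample object from the question is inlined, and the shell variable names are only illustrative.

```shell
input='[{"id":125650,"status":"success","name":"build_job","artifacts":[{"file_type":"archive","size":72720116,"filename":"artifacts.zip","file_format":"zip"},{"file_type":"metadata","size":1406,"filename":"metadata.gz","file_format":"gzip"}]}]'

result=$(printf '%s' "$input" | jq '
  .[]
  | select(.name == "build_job"
           and .status == "success"
           # true as soon as one artifact satisfies both tests
           and any(.artifacts[]; .file_type == "archive" and .size > 0))
  | .id')

echo "$result"   # 125650
```

Because any/2 produces a single boolean per job, each matching id is emitted once, however many archive artifacts the job has.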
Try this:
.[0] | select(
.name == "build_job" and .status == "success" and (
.artifacts[] | select(.file_type == "archive") | length > 0
)
) | .id
This selects successful build_jobs containing one or more archive artifacts. Unfortunately, multiple ids are returned if there is more than one such artifact. Here's how to wrap the expression to fix that:
[
.[] | select(
.name == "build_job" and .status == "success" and (
.artifacts[] | select(.file_type == "archive") | length > 0
)
)
] | unique | .[].id
For the last condition, take the array .artifacts, reduce it to the elements matching your criteria with map(select(.file_type == "archive")), and test the length of the resulting array with length > 0.
All together:
.[0] | select(
.name == "build_job" and
.status == "success" and (
(.artifacts | map(select(.file_type == "archive"))) | length > 0
)
)
| .id
I have a big JSON document where I need to take a subtree and copy it to another place, with some properties (a lot of them) updated. For example:
{
"items": [
{ "id": 1, "other": "abc"},
{ "id": 2, "other": "def"},
{ "id": 3, "other": "ghi"}
]
}
Say that I'd like to duplicate the record having id == 2 and replace the character e in the other field with x using a regex. That could go something like this (I'm sure there is a better way, but I'm a beginner):
jq '.items |= . + [.[]|select (.id == 2) as $orig | .id=4 | .other=($orig.other | sub("e";"x"))]'<sample.json
producing
{
"items": [
{
"id": 1,
"other": "abc"
},
{
"id": 2,
"other": "def"
},
{
"id": 3,
"other": "ghi"
},
{
"id": 4,
"other": "dxf"
}
]
}
Now that's great. But suppose that there isn't just one other field. There are a multitude of them, spread over a deep tree. I could issue multiple sub operations, but assuming the replacement pattern is sufficiently selective, maybe we can turn the whole JSON subtree into a string (trivial with tostring) and replace all occurrences using a single sub call. But how do I turn that substituted string back into an object (is that what it's called?) so that I can add it back to the items array?
Here's a program that might be a solution to the general problem you are describing, but if not at least illustrates how problems of this type can be solved. Note in particular that there is no explicit reference to a field named "other", and that (thanks to walk) the update function is applied to all candidate JSON objects in the input.
def update($n):
if .items | length > 0
then ((.items[0]|keys_unsorted) - ["id"]) as $keys
| if ($keys | length) == 1
then $keys[0] as $key
| (.items|map(.id) | max + 1) as $newid
| .items |= . + [.[] | select(.id == $n) as $orig | .id=$newid | .[$key]=($orig[$key] | sub("e";"x"))]
else .
end
else .
end;
walk(if type == "object" and has("items") then update(2) else . end)
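For the string round trip the question asks about, jq's fromjson is the inverse of tojson, so a serialized subtree can be edited as text and parsed back. The following is a minimal sketch of that idea, not part of the answer above. It assumes the sample input from the question and deliberately uses the pattern "def" rather than "e", because a pattern as short as "e" would also rewrite the key name "other" inside the serialized text.

```shell
input='{"items":[{"id":1,"other":"abc"},{"id":2,"other":"def"},{"id":3,"other":"ghi"}]}'

result=$(printf '%s' "$input" | jq -c '
  .items |= . + [ .[]
                  | select(.id == 2)
                  | .id = 4
                  | tojson                 # serialize the copied subtree
                  | gsub("def"; "dxf")     # replace every match in the text
                  | fromjson ]             # parse the text back into an object
')

echo "$result"
```

Note that sub replaces only the first match, so gsub is used here to replace all occurrences. If the edited text is no longer valid JSON, fromjson raises an error.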
Now, this is somewhat similar to jq: select only an array which contains element A but not element B but it somehow doesn't work for me (which is likely my fault)... ;-)
So here's what we have:
[ {
"employeeType": "student",
"cn": "dc8aff1",
"uid": "dc8aff1",
"ou": [
"4210910",
"4210910 #Abg",
"4210910 Abgang",
"4240115",
"4240115 5",
"4240115 5\/5"
]
},
{
"employeeType": "student",
"cn": "160f656",
"uid": "160f656",
"ou": [
"4210910",
"4210910 3",
"4210910 3a"
] } ]
I'd like to select all elements where ou does not contain a specific string, say "4210910 3a" or - which would be even better - where ou does not contain any member of a given list of strings.
When it comes to possibly changing inputs, you should make it a parameter to your filter, rather than hardcoding it in. Also, using contains might not work for you in general. It runs the filter recursively so even substrings will match which might not be preferred.
For example:
["10", "20", "30", "40", "50"] | contains(["0"])
is true
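A quick way to see the difference is to run contains next to an exact-equality test on the same array. This is a small sketch, and the shell variable name `result` is only illustrative.

```shell
result=$(jq -nc '
  ["10", "20", "30", "40", "50"]
  | [ contains(["0"]),        # substring match: "0" occurs inside "10"
      any(.[]; . == "0") ]    # exact match: no element equals "0"
')

echo "$result"   # [true,false]
```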
I would write it like this:
$ jq --argjson ex '["4210910 3a"]' 'map(select(all(.ou[]; $ex[]!=.)))' input.json
This response addresses the case where .ou is an array and we are given another array of forbidden strings.
For clarity, let's define a filter, intersectq(a;b), that will return true iff the arrays have an element in common:
def intersectq(a;b):
any(a[]; . as $x | any( b[]; . == $x) );
This is effectively a loop-within-a-loop, but because of the semantics of any/2, the computation will stop once a match has been found.(*)
Assuming $ex is the list of exceptions, then the filter we could use to solve the problem would be:
map(select(intersectq(.ou; $ex) | not))
For example, we could use an invocation along the lines suggested by Jeff:
$ jq --argjson ex '["4210910 3a"]' -f myfilter.jq input.json
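Put together as one self-contained command, the approach could be sketched as follows. The input is a trimmed version of the sample data from the question, the final .uid projection is added only to keep the output short, and the shell variable names are illustrative.

```shell
# Trimmed sample: only the uid and ou fields are kept.
input='[{"uid":"dc8aff1","ou":["4210910","4210910 #Abg","4240115"]},{"uid":"160f656","ou":["4210910","4210910 3","4210910 3a"]}]'

result=$(printf '%s' "$input" | jq -c --argjson ex '["4210910 3a"]' '
  # true iff the two arrays share at least one element
  def intersectq(a; b): any(a[]; . as $x | any(b[]; . == $x));
  map(select(intersectq(.ou; $ex) | not) | .uid)
')

echo "$result"   # ["dc8aff1"]
```

Only the first record survives, because the second one has the forbidden string "4210910 3a" in its ou array.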
Now you might ask: why use the any-within-any double loop rather than .[]-within-all double loop? The answer is efficiency, as can be seen using debug:
$ jq -n '[1,2,3] as $a | [1,1] as $b | all( $a[]; ($b[] | debug) != .)'
["DEBUG:",1]
["DEBUG:",1]
false
$ jq -n '[1,2,3] as $a | [1,1] as $b | all( $a[]; . as $x | all( $b[]; debug | $x != .))'
["DEBUG:",1]
false
(*) Footnote
Of course intersectq/2 as defined here is still O(m*n) and thus inefficient, but the main point of this post is to highlight the drawback of the .[]-within-all double loop.
Here is a solution that checks the .ou member of each element of the input using foreach and contains.
["4210910 3a"] as $list # adjust as necessary
| .[]
| foreach $list[] as $e (
.; .; if .ou | contains([$e]) then . else empty end
)
EDIT: I now realize a filter of the form foreach E as $X (.; .; R) can almost always be rewritten as E as $X | R so the above is really just
["4210910 3a"] as $list
| .[]
| $list[] as $e
| if .ou | contains([$e]) then . else empty end
I want to use jq to map my input
["a", "b"]
to output
[{name: "a", index: 0}, {name: "b", index: 1}]
I got as far as
0 as $i | def incr: $i = $i + 1; [.[] | {name:., index:incr}]'
which outputs:
[
{
"name": "a",
"index": 1
},
{
"name": "b",
"index": 1
}
]
But I'm missing something.
Any ideas?
It's easier than you think.
to_entries | map({name:.value, index:.key})
to_entries takes an object and returns an array of key/value pairs. In the case of arrays, it effectively makes index/value pairs. You could map those pairs to the items you wanted.
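To make the intermediate step visible, this sketch prints what to_entries produces for the sample array and then the mapped result. The shell variable names `entries` and `result` are only illustrative.

```shell
# Step 1: to_entries on an array pairs each index (key) with its value.
entries=$(jq -nc '["a", "b"] | to_entries')
echo "$entries"   # [{"key":0,"value":"a"},{"key":1,"value":"b"}]

# Step 2: rename the fields to the shape asked for in the question.
result=$(jq -nc '["a", "b"] | to_entries | map({name: .value, index: .key})')
echo "$result"    # [{"name":"a","index":0},{"name":"b","index":1}]
```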
A more "hands-on" approach is to use reduce:
["a", "b"] | . as $in | reduce range(0;length) as $i ([]; . + [{"name": $in[$i], "index": $i}])
Here are a few more ways. Assuming input.json contains your data
["a", "b"]
and you invoke jq as
jq -M -c -f filter.jq input.json
then any of the following filter.jq filters will generate
{"name":"a","index":0}
{"name":"b","index":1}
1) using keys and foreach
foreach keys[] as $k (.;.;[$k,.[$k]])
| {name:.[1], index:.[0]}
EDIT: I now realize a filter of the form foreach E as $X (.; .; R) can almost always be rewritten as E as $X | R so the above is really just
keys[] as $k
| [$k, .[$k]]
| {name:.[1], index:.[0]}
which can be simplified to
keys[] as $k
| {name:.[$k], index:$k}
2) using keys and transpose
[keys, .]
| transpose[]
| {name:.[1], index:.[0]}
3) using a function
def enumerate:
def _enum(i):
if length<1
then empty
else [i, .[0]], (.[1:] | _enum(i+1))
end
;
_enum(0)
;
enumerate
| {name:.[1], index:.[0]}