How to use jq to format an array of objects as a separated list of key values

How can I (generically) transform the input file below into the output file below, using jq? The record format of the output file is: array_index | key | value
Input file:
[{"a": 1, "b": 10},
{"a": 2, "d": "fred", "e": 30}]
Output File:
0|a|1
0|b|10
1|a|2
1|d|fred
1|e|30

Here's a solution using tostream, which creates a stream of paths and their values. Use select to drop the closing events that carry no value, flatten to merge path and value into a single array, and join for the output format:
jq -r 'tostream | select(has(1)) | flatten | join("|")'
0|a|1
0|b|10
1|a|2
1|d|fred
1|e|30
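To see why the select(has(1)) step is needed, it helps to inspect the raw stream first (a quick check on the same sample input):

```shell
# tostream emits [path, value] pairs for leaves (length 2) and bare [path]
# closing events for containers (length 1); select(has(1)) keeps only the former.
echo '[{"a": 1, "b": 10}, {"a": 2, "d": "fred", "e": 30}]' |
  jq -c 'tostream'
```

This prints the five leaf events ([[0,"a"],1] through [[1,"e"],30]) plus three value-less closing events ([[0,"b"]], [[1,"e"]], [[1]]), which select(has(1)) filters out.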
Or a very similar one using paths to get the paths, scalars for the filter, and getpath for the corresponding value:
jq -r 'paths(scalars) as $p | [$p[], getpath($p)] | join("|")'
0|a|1
0|b|10
1|a|2
1|d|fred
1|e|30
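To see what the first building block produces on its own, run it on the same sample input:

```shell
# paths(scalars) yields one [index, key] path per leaf value;
# getpath($p) then looks up the value stored at each such path.
echo '[{"a": 1, "b": 10}, {"a": 2, "d": "fred", "e": 30}]' |
  jq -c 'paths(scalars)'
```

This emits the five paths [0,"a"], [0,"b"], [1,"a"], [1,"d"], [1,"e"], one per output record.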

< file.json jq -r 'to_entries
| .[]
| .key as $k
| ((.value | to_entries )[]
| [$k, .key, .value])
| @csv'
Output:
0,"a",1
0,"b",10
1,"a",2
1,"d","fred"
1,"e",30
To match the requested output exactly, you just need to remove the double quotes (and swap the commas for pipes).

to_entries can be used to loop over the elements of arrays and objects in a way that gives both the key (index) and the value of the element.
jq -r '
to_entries[] |
.key as $id |
.value |
to_entries[] |
[ $id, .key, .value ] |
join("|")
'
Replace join("|") with @csv to get proper CSV.
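A minimal illustration of what the outer to_entries produces here (trimmed sample input):

```shell
# On an array, to_entries wraps each element as {key: <index>, value: <element>},
# which is what lets the filter capture the outer index as $id.
echo '[{"a": 1}, {"a": 2}]' | jq -c 'to_entries'
```

The inner to_entries then does the same for each object's keys and values.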


combine multiple jq commands into one?

I use the following command to extract data from a json file. Suppose that I just keep the first match if there is a match. (If there is no match, an error should be printed.)
.[] | select(.[1] | objects."/Type".N == "/Catalog") | .[1]."/Dests"
Let's say the output is 16876. Then, I use the following jq code to extract the data.
.[] | select(.[0] == 16876) | .[1] | to_entries[] | [.key, .value[0], .value[2].F, .value[3].F] | @tsv
This involves multiple passes of the input json data. Can the two jq commands be combined into one, so that one pass of the input json data is sufficient?
Use as $dests to bind the result of your first query to the name $dests, then refer back to it in the second query, all within a single jq invocation.
(.[] | select(.[1] | objects."/Type".N == "/Catalog") | .[1]."/Dests") as $dests
| .[] | select(.[0] == $dests) | .[1] | to_entries[] | [.key, .value[0], .value[2].F, .value[3].F] | @tsv
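A self-contained sketch of the same as-binding pattern on made-up data (the [id, payload] structure only mimics the question's input):

```shell
# The first branch finds a value and binds it to $id; the second branch
# reuses it -- one pass over the input, one jq invocation.
echo '[["id", 42], [42, {"x": 1}]]' |
  jq -c '(.[] | select(.[0] == "id") | .[1]) as $id
         | .[] | select(.[0] == $id) | .[1]'
```

This prints {"x":1}: the value looked up via the id found in the first branch.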

jq: Why do two expressions which produce identical output produce different output when surrounded by an array operator?

I have been trying to understand jq, and the following puzzle is giving me a headache: I can construct two expressions, A and B, which seem to produce the same output. And yet, when I surround them with [] array construction brackets (as in [A] and [B]), they produce different output. In this case, the expressions are:
A := jq '. | add'
B := jq -s '.[] | add'
Concretely:
$ echo '[1,2] [3,4]' | jq '.'
[1,2]
[3,4]
$ echo '[1,2] [3,4]' | jq '. | add'
3
7
# Now surround with array construction and we get two values:
$ echo '[1,2] [3,4]' | jq '[. | add]'
[3]
[7]
$ echo '[1,2] [3,4]' | jq -s '.[]'
[1,2]
[3,4]
$ echo '[1,2] [3,4]' | jq -s '.[] | add'
3
7
# Now surround with array construction and we get only one value:
$ echo '[1,2] [3,4]' | jq -s '[.[] | add]'
[3,7]
What is going on here? Why is it that the B expression, which applies the --slurp setting but appears to produce identical intermediate output to the A expression, produces different output when surrounded with [] array construction brackets?
When jq is fed a stream, such as [1,2] [3,4] with its two inputs, it executes the filter independently for each. That's why jq '[. | add]' produces two results; each addition is separately wrapped into an array.
When jq is given the --slurp option, it combines the stream to an array, rendering it just one input. Therefore jq -s '[.[] | add]' will have one result only; the multiple additions will be caught by the array constructor, which is executed just once.
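The difference is easy to demo side by side:

```shell
# Without -s: the filter (and hence the array constructor) runs once per input.
printf '[1,2]\n[3,4]\n' | jq -c '[. | add]'
# With -s: the inputs are slurped into one array, so the constructor runs once.
printf '[1,2]\n[3,4]\n' | jq -cs '[.[] | add]'
```

The first command prints [3] and [7] as two separate results; the second prints the single result [3,7].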

How to format my JSON input to SQL insert statements

How can I (generically) transform the input file below into the output file below, using jq:
Input file:
[{"a": 1, "b": 10},
{"a": 2, "d": "fred", "e": 30}]
Output file:
INSERT INTO mytab (a,b) VALUES (1,10);
INSERT INTO mytab (a,d,e) VALUES (2,"fred",30);
Using a combination of string interpolation and two variants of string join operations, one using the @csv filter and the other using join(","):
jq --raw-output '
.[]| to_entries | map(.key) as $k | map(.value) as $v |
"INSERT INTO mytab (\($k | join(","))) VALUES (\($v | @csv));"'
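The two join variants behave differently on strings, which is why both appear in the filter (a small check on one record of the sample input):

```shell
# @csv quotes strings and leaves numbers bare, which suits the VALUES list;
# join(",") quotes nothing, which suits the column-name list.
echo '{"a": 2, "d": "fred", "e": 30}' | jq -r '[.[]] | @csv'
echo '{"a": 2, "d": "fred", "e": 30}' | jq -r 'keys_unsorted | join(",")'
```

This prints 2,"fred",30 followed by a,d,e.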

processing TSV embedded with JSON using jq?

$ jq --slurp '.[] | .a' <<< '{"a": 1}'$'\n''{"a": 2}'
1
2
I can process a one-column TSV file like above. When there are multiple columns and one column is JSON, how do I print the processing result of the JSON column along with the other columns printed literally? In the following example, how to print the first column as-is and the JSON processing result of the 2nd column?
$ jq --slurp '.[] | .a' <<< $'A\t{"a": 1}'$'\nB\t{"a": 2}'
parse error: Invalid numeric literal at line 1, column 2
Before piping your TSV file into jq you should extract the JSON column first. For instance, use cut from GNU coreutils to get the second field in a tab-separated line:
cut -f2 <<< $'A\t{"a": 1}'$'\nB\t{"a": 2}' | jq --slurp '.[] | .a'
In order to print the other columns as well, you may use paste to put the columns back together:
paste <(
cut -f1 <<< $'A\t{"a": 1}'$'\nB\t{"a": 2}'
) <(
cut -f2 <<< $'A\t{"a": 1}'$'\nB\t{"a": 2}' | jq --slurp '.[] | .a'
)
To solve this entirely in jq you have to read the input as raw text first and interpret the second column as JSON using jq's fromjson (dividing a string by another string, as in ./"\t", splits it on that separator):
jq -Rr './"\t" | .[1] |= (fromjson | .a) | @tsv' <<< $'A\t{"a": 1}'$'\nB\t{"a": 2}'
jq --raw-input --raw-output --slurp 'split("\n") | map(split("\t")) | map(select(length>0)) | .[] | {"p":.[0], "j":.[1] | fromjson} | [.p, .j.a] | @tsv' <<< $'A\t{"a": 1}'$'\nB\t{"a": 2}'
A 1
B 2
or process line by line for huge data
while IFS= read -r line; do
echo "$line" | jq --raw-input --raw-output --slurp 'split("\t") | {"p":.[0], "j":.[1] | fromjson} | [.p, .j.a] | @tsv'
done < ./data.txt
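At the core of all these variants is fromjson; a minimal single-line sketch:

```shell
# -R reads the line as a raw string; split("\t") separates the columns;
# fromjson parses the JSON column; @tsv joins the row back with tabs.
printf 'A\t{"a": 1}\n' |
  jq -Rr 'split("\t") | .[1] |= (fromjson | .a) | @tsv'
```

This prints the first column untouched and the extracted value, tab-separated: A	1.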

How do I add an index in jq

I want to use jq to map my input
["a", "b"]
to output
[{name: "a", index: 0}, {name: "b", index: 1}]
I got as far as
0 as $i | def incr: $i = $i + 1; [.[] | {name:., index:incr}]
which outputs:
[
{
"name": "a",
"index": 1
},
{
"name": "b",
"index": 1
}
]
But I'm missing something.
Any ideas?
It's easier than you think.
to_entries | map({name:.value, index:.key})
to_entries takes an object and returns an array of key/value pairs. In the case of arrays, it effectively makes index/value pairs. You could map those pairs to the items you wanted.
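Concretely, for the input in the question:

```shell
# to_entries turns array indices into .key and elements into .value;
# map then reshapes each pair into the desired object.
echo '["a", "b"]' | jq -c 'to_entries | map({name: .value, index: .key})'
```

This prints [{"name":"a","index":0},{"name":"b","index":1}].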
A more "hands-on" approach is to use reduce:
["a", "b"] | . as $in | reduce range(0;length) as $i ([]; . + [{"name": $in[$i], "index": $i}])
Here are a few more ways. Assuming input.json contains your data
["a", "b"]
and you invoke jq as
jq -M -c -f filter.jq input.json
then any of the following filter.jq filters will generate
{"name":"a","index":0}
{"name":"b","index":1}
1) using keys and foreach
foreach keys[] as $k (.;.;[$k,.[$k]])
| {name:.[1], index:.[0]}
EDIT: I now realize a filter of the form foreach E as $X (.; .; R) can almost always be rewritten as E as $X | R so the above is really just
keys[] as $k
| [$k, .[$k]]
| {name:.[1], index:.[0]}
which can be simplified to
keys[] as $k
| {name:.[$k], index:$k}
2) using keys and transpose
[keys, .]
| transpose[]
| {name:.[1], index:.[0]}
3) using a function
def enumerate:
def _enum(i):
if length<1
then empty
else [i, .[0]], (.[1:] | _enum(i+1))
end
;
_enum(0)
;
enumerate
| {name:.[1], index:.[0]}
