Use jq to combine two arrays of objects on a certain key - jq

I am trying to use jq to solve this problem.
Suppose I have the following object
{
"listA": [
{
"id": "12345",
"code": "001"
}
],
"listB": [
{
"id": "12345",
"prop": "AABBCC"
}
]
}
In reality my two lists are longer, but the id isn't repeated within each list.
How can I combine the two lists into a single list in which each item is one object that collects the non-id properties for a given id?
For example, from the object above, I'd like the following:
{
"listC" : [
{
"id": "12345",
"code": "001",
"prop": "AABBCC"
}
]
}

A simple way would be to concatenate the arrays, group the elements by id, and map each group into a single object using add:
jq '.listA+.listB | group_by(.id) | map(add)' test.json
If the file may contain more than two arrays to merge, you could instead use flatten to concatenate all of them.
Test case below:
# cat test.json
{
"listA": [
{ "id": "12345", "code": "001" },
{ "id": "12346", "code": "002" }
],
"listB": [
{ "id": "12345", "prop": "AABBCC" }
]
}
# jq 'flatten | group_by(.id) | map(add)' test.json
# or
# jq '.listA+.listB | group_by(.id) | map(add)' test.json
[
{
"id": "12345",
"code": "001",
"prop": "AABBCC"
},
{
"id": "12346",
"code": "002"
}
]
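To get the exact shape asked for in the question, the merged array can be wrapped under a listC key. The following is a minimal sketch, assuming the same data as test.json above is passed inline; the shell variable names input and result are only illustrative.

```shell
# Sample input mirroring test.json above (inlined so the snippet is self-contained).
input='{"listA":[{"id":"12345","code":"001"},{"id":"12346","code":"002"}],"listB":[{"id":"12345","prop":"AABBCC"}]}'

# Concatenate both lists, group by id, merge each group with add,
# then wrap the resulting array in an object under the key "listC".
result=$(printf '%s' "$input" | jq -c '{listC: (.listA + .listB | group_by(.id) | map(add))}')
echo "$result"
```

The parentheses matter: without them the object construction would only apply to the first part of the pipeline.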

Using group_by entails a sort, which is unnecessary, so if efficiency is a concern, then an alternative approach such as the following should be considered:
INDEX(.listA[]; .id) as $one
| INDEX(.listB[]; .id) as $two
| reduce ($one|keys_unsorted[]) as $k ($two; .[$k] += $one[$k])
| {listC: [.[]] }
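As a runnable sketch of the INDEX-based filter, assuming jq 1.6 or later (where INDEX is available) and the same sample data as in the test case; the shell variable names are only illustrative.

```shell
# Sample input mirroring test.json (inlined so the snippet is self-contained).
input='{"listA":[{"id":"12345","code":"001"},{"id":"12346","code":"002"}],"listB":[{"id":"12345","prop":"AABBCC"}]}'

# Build one lookup object per list keyed by id, then fold listA's entries
# into listB's lookup; ids missing from listB start from null, and
# null + object yields the object itself.
result=$(printf '%s' "$input" | jq -c '
  INDEX(.listA[]; .id) as $one
  | INDEX(.listB[]; .id) as $two
  | reduce ($one | keys_unsorted[]) as $k ($two; .[$k] += $one[$k])
  | {listC: [.[]]}
')
echo "$result"
```

Note that with this approach the order of the items and of the keys inside each item can differ from the group_by version, and on a key conflict the value from listA wins.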

Related

Group nested array objects to parent key in JQ

I have JSON coming from an external application, formatted like so:
{
"ticket_fields": [
{
"url": "https://example.com/1122334455.json",
"id": 1122334455,
"type": "tagger",
"custom_field_options": [
{
"id": 123456789,
"name": "I have a problem",
"raw_name": "I have a problem",
"value": "help_i_have_problem",
"default": false
},
{
"id": 456789123,
"name": "I have feedback",
"raw_name": "I have feedback",
"value": "help_i_have_feedback",
"default": false
}
]
},
{
"url": "https://example.com/6677889900.json",
"id": 6677889900,
"type": "tagger",
"custom_field_options": [
{
"id": 321654987,
"name": "United States",
"raw_name": "United States",
"value": "location_123_united_states",
"default": false
},
{
"id": 987456321,
"name": "Germany",
"raw_name": "Germany",
"value": "location_456_germany",
"default": false
}
]
}
]
}
The end goal is to get the data into a TSV in which each object in the custom_field_options array is paired with its parent ID (ticket_fields.id) and represented on its own line, like so:
Ticket Field ID    Name                Value
1122334455         I have a problem    help_i_have_problem
1122334455         I have feedback     help_i_have_feedback
6677889900         United States       location_123_united_states
6677889900         Germany             location_456_germany
I have already been able to export the data to TSV, but each ticket field ends up on a single line, with all names followed by all values rather than one option per line, like so:
Using jq -r '.ticket_fields[] | select(.type=="tagger") | [.id, .custom_field_options[].name, .custom_field_options[].value] | @tsv'
Ticket Field ID    Name                Name               Value                         Value
1122334455         I have a problem    I have feedback    help_i_have_problem           help_i_have_feedback
6677889900         United States       Germany            location_123_united_states    location_456_germany
Each of the custom_field_options arrays in production may consist of any number of objects (not limited to 2 each). But I seem to be stuck on how to appropriately group or map these objects to their parent ticket_fields.id and to transpose the data in a clean manner. The select(.type=="tagger") is mentioned in the query as there are multiple values for ticket_fields.type which need to be filtered out.
Based on another answer on here, I did try variants of jq -r '.ticket_fields[] | select(.type=="tagger") | map(.custom_field_options |= from_entries) | group_by(.custom_field_options.ticket_fields) | map(map( .custom_field_options |= to_entries))' without success. Any assistance would be greatly appreciated!
You need two nested iterations, one over each array. Save the value of .id in a variable so you can access it inside the inner iteration.
jq -r '
.ticket_fields[] | select(.type=="tagger") | .id as $id
| .custom_field_options[] | [$id, .name, .value]
| @tsv
'
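If the header row from the desired output is also needed, it can be emitted as an extra array ahead of the data rows. This is a sketch under assumptions: the input is a trimmed-down version of the question's JSON, and the second field with type "text" is hypothetical, added only to show the select filter at work.

```shell
# Trimmed sample input; the "text" field is a made-up entry that should be filtered out.
input='{"ticket_fields":[{"id":1122334455,"type":"tagger","custom_field_options":[{"name":"I have a problem","value":"help_i_have_problem"},{"name":"I have feedback","value":"help_i_have_feedback"}]},{"id":6677889900,"type":"text","custom_field_options":[]}]}'

# Emit the header array first, then one [id, name, value] array per option;
# the comma binds tighter than the pipe, so @tsv formats every array.
result=$(printf '%s' "$input" | jq -r '
  ["Ticket Field ID", "Name", "Value"],
  (.ticket_fields[] | select(.type == "tagger") | .id as $id
   | .custom_field_options[] | [$id, .name, .value])
  | @tsv
')
echo "$result"
```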

jq: list users belonging to a specific group in array

input json:
[
{
"user": "u1"
},
{
"user": "u2",
"groups": [
{
"id": "100001",
"name": "G1"
},
{
"id": "100002",
"name": "G2"
}
]
},
{
"user": "u3",
"groups": [
{
"id": "100001",
"name": "G1"
}
]
}
]
I want to find all users belonging to a specific group (searching by group name or group id in the groups array):
$ jq -r '.[]|select(.groups[].name=="G1" | .user)' json
jq: error (at json:27): Cannot iterate over null (null)
The desired output format when searching for the example group G1 would be:
u2
u3
Additional question:
Is it possible to produce comma-separated output u2,u3 without using external utilities like tr?
It is better to pass your search data in as parameters using --arg, and to use any to avoid duplicate outputs if both inputs match:
jq -r --arg id "" --arg name "G1" '
.[] | select(.groups | map(.id == $id or .name == $name) | any)? | .user
'
u2
u3
Using the error-suppression operator ?, you could do a select as below:
map(select(.groups[].name == "G1")? | .user)
and unwrap the results from the array by using [] at the end of the filter. To combine multiple selection conditions, use the boolean operators and/or inside the select statement.
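For the additional question about comma-separated output, one option is to collect the matching users into an array and join them inside jq. A minimal sketch, assuming the input JSON from the question is passed inline and the group name is supplied via --arg:

```shell
# Input JSON from the question, inlined so the snippet is self-contained.
input='[{"user":"u1"},{"user":"u2","groups":[{"id":"100001","name":"G1"},{"id":"100002","name":"G2"}]},{"user":"u3","groups":[{"id":"100001","name":"G1"}]}]'

# .groups[]? yields nothing (instead of an error) for users without groups;
# any(generator; condition) is true if at least one group matches, so each
# user appears at most once. join(",") builds the comma-separated string.
result=$(printf '%s' "$input" | jq -r --arg name "G1" '
  [ .[] | select(any(.groups[]?; .name == $name)) | .user ] | join(",")
')
echo "$result"   # prints u2,u3
```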

Extract nested properties from an array of objects

I have the following JSON file:
{
"filter": [
{
"id": "id_1",
"criteria": {
"from": "mail#domain1.com",
"subject": "subject_1"
},
"action": {
"addLabelIds": [
"Label_id_1"
],
"removeLabelIds": [
"INBOX",
"SPAM"
]
}
},
{
"id": "id_2",
"criteria": {
"from": "mail#domain2.com",
"subject": "subject_1"
},
"action": {
"addLabelIds": [
"Label_id_2"
],
"removeLabelIds": [
"INBOX",
"SPAM"
]
}
}
]
}
And I would like to extract the email values: mail#domain1.com and mail#domain2.com
I have tried this command:
jq --raw-output '.filter[] | select(.criteria.from | test("mail"; "i")) | .id'
But it does not work; I get this error:
jq: error (at <stdin>:1206): null (null) cannot be matched, as it is not a string
exit status 5
Another point: how can I display the value of the "id" key where the "from" key value is mail#domain1.com?
In my file that would be id = id_1.
Do you have any idea?
If you only need to extract the emails from .criteria.from then this filter is enough as far as I can tell:
jq --raw-output '.filter[].criteria.from' file.json
If some objects don't have a criteria object then you can filter out nulls with:
jq --raw-output '.filter[].criteria.from | select(. != null)' file.json
If you want to keep the emails equal to "mail#domain1.com":
jq --raw-output '.filter[].criteria.from | select(. == "mail#domain1.com")' file.json
If you want to keep the emails that start with "mail#":
jq --raw-output '.filter[].criteria.from | select(. != null) | select(startswith("mail#"))' file.json
"I would like to extract emails values"
There is a wide spectrum of possible answers, with these amongst the least specific with respect to where in the JSON the email addresses occur:
.. | objects | .from | select(type=="string")
.. | strings | select(test("#([a-z0-9]+[.])+[a-z]+$"))
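For the second part of the question (printing the id for a given "from" value), and for the original error caused by objects without a "from" string, the following sketch may help. It assumes a trimmed version of the question's file, plus a hypothetical third entry (id_3) without a "from" key to reproduce the null case, and a jq build with regex support for test.

```shell
# Trimmed sample input; id_3 is a made-up entry lacking "from".
input='{"filter":[{"id":"id_1","criteria":{"from":"mail#domain1.com","subject":"subject_1"}},{"id":"id_2","criteria":{"from":"mail#domain2.com","subject":"subject_1"}},{"id":"id_3","criteria":{"subject":"no sender"}}]}'

# Exact match: select the whole filter object by its sender, then print its id.
result=$(printf '%s' "$input" | jq -r '.filter[] | select(.criteria.from == "mail#domain1.com") | .id')
echo "$result"

# Null-safe regex match: substitute "" for a missing "from" so test never sees null.
safe=$(printf '%s' "$input" | jq -r '.filter[] | select((.criteria.from // "") | test("mail"; "i")) | .id')
echo "$safe"
```

The key difference from the filters above is that select is applied to the whole object rather than to .criteria.from, so .id is still reachable afterwards.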

Combine the value from a key with all array entries

I have json input as follows:
[{
"a": "123",
"b": [
"xyz",
"uvw"
]
}, {
"a": "456",
"b": [
"ghi"
]
}]
and I'd like to produce a list where each object's "a" is combined with each element of "b" using a delimiter. Is this possible to do using jq?
123|xyz
123|uvw
456|ghi
You can change the delimiter on the fly if you parameterize it.
$ jq -r --arg delim '|' '.[] | "\(.a)\($delim)\(.b[])"' input.json

How do I select multiple fields in jq?

My input file looks something like this:
{
"login": "dmaxfield",
"id": 7449977,
...
}
{
"login": "dmaxfield",
"id": 7449977,
...
}
I can get all the login names with this: cat members | jq '.[].login'
but I have not been able to crack the syntax to get both the login and id?
You can use jq '.[] | .login, .id' to obtain each login followed by its id.
This works for me:
> echo '{"a":1,"b":2,"c":3}{"a":1,"b":2,"c":3}' | jq '{a,b}'
{
"a": 1,
"b": 2
}
{
"a": 1,
"b": 2
}
Here is one more example (jq-1.6):
Walk through an array and, for each element, select one of its own fields together with a field of an object nested inside it
echo '[{"id":1, "private_info": {"name": "Ivy", "age": 18}}, {"id":2, "private_info": {"name": "Tommy", "age": 18}}]' | jq ".[] | {id: .id, name: .private_info.name}" -
{
"id": 1,
"name": "Ivy"
}
{
"id": 2,
"name": "Tommy"
}
Without the example data:
jq ".[] | {id, name: .private_info.name}" -
.[]: walk through an array
{id, name: .private_info.name}: take .id and .private_info.name and wrap them into an object with the field names "id" and "name" respectively
In order to select values which are nested at different levels (i.e. both first and second level), you might use the following:
echo '[{"a":{"aa":1,"ab":2},"b":3,"c":4},{"a":{"aa":5,"ab":6},"b":7,"c":8}]' \
| jq '.[]|[.a.aa,.a.ab,.b]'
[
1,
2,
3
]
[
5,
6,
7
]
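If each login should appear on the same line as its id, the two fields can be put in an array and formatted with @tsv, or interpolated into a string. A small sketch, assuming the members are held in an array; the second entry ("octocat") is hypothetical sample data, not from the question.

```shell
# Hypothetical sample input shaped like the question's data.
input='[{"login":"dmaxfield","id":7449977},{"login":"octocat","id":1}]'

# One tab-separated line per member.
tsv=$(printf '%s' "$input" | jq -r '.[] | [.login, .id] | @tsv')
echo "$tsv"

# Same fields via string interpolation, with a custom separator.
pairs=$(printf '%s' "$input" | jq -r '.[] | "\(.login): \(.id)"')
echo "$pairs"
```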
