I have an array of objects of various types which reference one another with UUIDs (a terraform.tfstate file). I'd like to select one value from one such object based on the appearance of a different value in another object, where the two objects are related by one of those UUIDs.
By way of example, I can do this:
$ jq '.modules[].resources[]
| select(.type == "openstack_compute_instance_v2" and
.primary.attributes.name == "jumpbox").primary.id' terraform.tfstate
"5edfe2bf-94df-49d5-8118-3e91fb52946b"
$ jq '.modules[].resources[]
| select(.type =="openstack_compute_floatingip_associate_v2" and
.primary.attributes.instance_id == "5edfe2bf-94df-49d5-8118-3e91fb52946b").primary.attributes.floating_ip' terraform.tfstate
"10.120.241.21"
Giving me the external floating IP of the 'jumpbox' VM based on its name.
I'd like to make that all one jq call. Is that possible?
This would be easier to answer if you provided more sample data but
working backwards from your commands (with some reformatting)
$ jq '
.modules[].resources[]
| select(.type == "openstack_compute_instance_v2" and .primary.attributes.name == "jumpbox")
| .primary.id
' terraform.tfstate
"5edfe2bf-94df-49d5-8118-3e91fb52946b"
$ jq '
.modules[].resources[]
| select(.type =="openstack_compute_floatingip_associate_v2" and .primary.attributes.instance_id == "5edfe2bf-94df-49d5-8118-3e91fb52946b")
| .primary.attributes.floating_ip
' terraform.tfstate
"10.120.241.21"
we can infer you have data which looks like
{
"modules": [
{
"resources": [
{
"type": "openstack_compute_instance_v2",
"primary": {
"id": "5edfe2bf-94df-49d5-8118-3e91fb52946b",
"attributes": {
"name": "jumpbox"
}
}
},
{
"type": "openstack_compute_floatingip_associate_v2",
"primary": {
"attributes": {
"instance_id": "5edfe2bf-94df-49d5-8118-3e91fb52946b",
"floating_ip": "10.120.241.21"
}
}
}
]
}
]
}
The following filter demonstrates a solution using functions, variables and parentheses ():
def get_primary_id($name):
select(.type == "openstack_compute_instance_v2"
and .primary.attributes.name == $name)
| .primary.id
;
def get_floating_ip($id):
select(.type =="openstack_compute_floatingip_associate_v2"
and .primary.attributes.instance_id == $id)
| .primary.attributes.floating_ip
;
.modules[]
| ( .resources[] | get_primary_id("jumpbox") ) as $id
| ( .resources[] | get_floating_ip($id) ) as $fip
| ($id, $fip)
If this filter is in filter.jq and data.json contains the sample data above,
then
$ jq -M -f filter.jq data.json
produces the output:
"5edfe2bf-94df-49d5-8118-3e91fb52946b"
"10.120.241.21"
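If a separate filter file is not wanted, the same join should also fit in a single command-line call. The following is only a sketch, assuming the data has the shape inferred above; `--arg` passes the instance name in as `$name`, and data.json is recreated here just to keep the example self-contained.

```shell
# Recreate the sample data inferred above (assumed shape, not the real tfstate).
cat > data.json <<'EOF'
{"modules":[{"resources":[
  {"type":"openstack_compute_instance_v2",
   "primary":{"id":"5edfe2bf-94df-49d5-8118-3e91fb52946b",
              "attributes":{"name":"jumpbox"}}},
  {"type":"openstack_compute_floatingip_associate_v2",
   "primary":{"attributes":{"instance_id":"5edfe2bf-94df-49d5-8118-3e91fb52946b",
                            "floating_ip":"10.120.241.21"}}}
]}]}
EOF

# Look up the instance id by name, bind it to $id,
# then reuse $id in the second select over the same resources array.
fip=$(jq -r --arg name jumpbox '
  .modules[].resources
  | (.[] | select(.type == "openstack_compute_instance_v2"
                  and .primary.attributes.name == $name) | .primary.id) as $id
  | .[] | select(.type == "openstack_compute_floatingip_associate_v2"
                 and .primary.attributes.instance_id == $id)
  | .primary.attributes.floating_ip
' data.json)
echo "$fip"
```

Binding the id with `as $id` keeps both lookups inside one jq program, so the UUID never has to be copied between commands by hand.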
Related
I have JSON with a varying number of elements in an array:
[
{
"system": {
"name": "sys1",
"interfaces": [
{
"ip": "1.1.1.1",
"ent": "ent1"
},
{
"ip": "2.2.2.2",
"ent": "ent0"
}
]
}
},
{
"system": {
"name": "sys2",
"interfaces": [
{
"ip": "3.3.3.3",
"ent": "ent0"
}
]
}
},
{
"system": {
"name": "sys3",
"interfaces": null
}
},
{
"system": {
"name": "sys4"
}
}
]
I need to get the following output with jq:
sys1 1.1.1.1 ent1
sys1 2.2.2.2 ent0
sys2 3.3.3.3 ent0
I tried the following filter:
$ jq -r '.[]|[.system.name, .system.interfaces[].ip, .system.interfaces[].ent]|@tsv' test_json2
sys1 1.1.1.1 2.2.2.2 ent1 ent0
sys2 3.3.3.3 ent0
How can I split line 1 to get the expected result?
Update: I have run into a new case where the array is null, and I now get the following error using the filter from pmf's answer:
jq: error (at test_json2:34): Cannot iterate over null (null)
Iterate outside the array which contains the .name. That way, another array is generated for each iteration step.
jq -r '.[].system | [.name] + (.interfaces[]? | [.[]]) | @tsv' test_json2
If the objects in the .interfaces array can have more than just those two fields, but you only want to output those two, name them explicitly.
jq -r '.[].system | [.name] + (.interfaces[]? | [.ip, .ent]) | @tsv' test_json2
Output is:
sys1 1.1.1.1 ent1
sys1 2.2.2.2 ent0
sys2 3.3.3.3 ent0
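To see what the `?` contributes, here is a small sketch that runs the same kind of filter against a cut-down copy of the input, including the null and missing interfaces cases. The shell variable names are only for illustration.

```shell
input='[
  {"system":{"name":"sys1","interfaces":[{"ip":"1.1.1.1","ent":"ent1"},{"ip":"2.2.2.2","ent":"ent0"}]}},
  {"system":{"name":"sys3","interfaces":null}},
  {"system":{"name":"sys4"}}
]'

# .interfaces[]? yields nothing (instead of raising an error) when
# .interfaces is null or missing, so sys3 and sys4 produce no rows.
rows=$(printf '%s' "$input" |
  jq -r '.[].system | [.name] + (.interfaces[]? | [.ip, .ent]) | @tsv')
printf '%s\n' "$rows"
```

Only the two sys1 rows are printed; without the `?`, jq would stop at sys3 with "Cannot iterate over null".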
You can use an array to group each interface.
.[] | .system | select(.name=="sys1" or .name=="sys2") | [.name] + (.interfaces[] | [.[]]) | @tsv
Demo
https://jqplay.org/s/iucSZqkB1r
Note that .[] on an object returns its values.
I was checking the jq tutorial at https://programminghistorian.org/en/lessons/json-and-jq
It reshapes some JSON, extracting data from a JSON file found at https://programminghistorian.org/assets/jq_twitter.json
At some point it makes a group_by, grouping data with the same user, extracting some user data and adding its corresponding tweet ids with the command
jq -s '. | group_by(.user) | .[] | {user_id: .[0].user.id, user_name: .[0].user.screen_name, user_followers: .[0].user.followers_count, tweet_ids: [.[].id]}'
so far, so good... the response looks like this (just a part is extracted):
{
"user_id": 18270633,
"user_name": "ahhthatswhy",
"user_followers": 559,
"tweet_ids": [
501064204661850100
]
}
{
"user_id": 27202261,
"user_name": "Dushan41",
"user_followers": 1201,
"tweet_ids": [
619172281751711700,
619172321564098600
]
}
{
"user_id": 2500422674,
"user_name": "pecanEgba74318",
"user_followers": 17,
"tweet_ids": [
619172331592773600
]
}
But then I would like to add a {"multiple_tweets": true} to all the objects that have more than one tweet_ids.
If I plainly pipe, like this, it works fine:
jq -s '. | group_by(.user) | .[] | {user_id: .[0].user.id, user_name: .[0].user.screen_name, user_followers: .[0].user.followers_count, tweet_ids: [.[].id]} | (select(.tweet_ids | length > 1) .multiple_tweets = true)'
a part of the result:
{
"user_id": 1653718716,
"user_name": "OAnnie8",
"user_followers": 315,
"tweet_ids": [
501064215160172540
]
}
{
"user_id": 356854246,
"user_name": "DrJLMooreIII",
"user_followers": 4888,
"tweet_ids": [
501064202904404000,
501064231387947000
],
"multiple_tweets": true
}
{
"user_id": 117155917,
"user_name": "rebekahwsm",
"user_followers": 5069,
"tweet_ids": [
501064233186893800
]
}
But if I want to use the |= operator instead (it is not really needed in this example; I was only doing it to understand update-assignment),
jq -s '. | group_by(.user) | .[] | {user_id: .[0].user.id, user_name: .[0].user.screen_name, user_followers: .[0].user.followers_count, tweet_ids: [.[].id]} |= (select(.tweet_ids | length > 1) .multiple_tweets = true)'
I get the error ' jq: error (at :30259): Invalid path expression with result {"user_id":1330235048,"use... '
Now the thing that I really can't understand. If instead of using the operator |= directly, I pipe through the identity operator first, it works fine.
What is the reason of this behaviour? Why does |.|= behave differently than |= ?
Why does this change anything?
jq -s '. | group_by(.user) | .[] | {user_id: .[0].user.id, user_name: .[0].user.screen_name, user_followers: .[0].user.followers_count, tweet_ids: [.[].id]} | . |= (select(.tweet_ids | length > 1) .multiple_tweets = true)'
I guess I'm still not understanding how the |= operator really works.
Thank you for your help.
The jq manual explains that behavior as follows:
The left-hand side can be any general path expression; see path().
Note that the left-hand side of |= refers to a value in .. Thus $var.foo |= . + 1 won't work as expected ($var.foo is not a valid or useful path expression in .); use $var | .foo |= . + 1 instead.
Since the underlying builtin (_modify) is implemented using setpath, getpath, and delpaths, the LHS of |= must be a valid path expression that can be represented as an array; in other words, path(LHS) must not fail. See the examples below.
$ jq -n 'path(1)'
jq: error (at <unknown>): Invalid path expression with result 1
$ jq -n '1 |= . + 1'
jq: error (at <unknown>): Invalid path expression with result 1
$ jq -n '1 | path(.)'
[]
$ jq -n '1 | . |= . + 1'
2
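Applied to the question, the object construction on the left of |= builds a new value rather than pointing into the input, which is why it is rejected while `.` is accepted. A minimal sketch of the difference, using made-up input:

```shell
# A field/index chain is a path, so path() can represent it as an array.
ok=$(jq -nc '{"a":[{"b":1}]} | path(.a[0].b)')
echo "$ok"

# An object construction creates a new value; it is not a path into the
# input, so |= rejects it with "Invalid path expression" and jq exits non-zero.
if jq -n '{"a":1} | {x: .a} |= .' >/dev/null 2>&1; then
  bad=accepted
else
  bad=rejected
fi
echo "$bad"
```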
Given the json
{ "games": [
{
"id":1,
"files": [ "foo.mp4" ]
},
{
"id":2,
"files": [ "foo.ogv", "bar.ogv" ]
},
{
"id":3,
"files": [ "bar.ogv" ]
}
]}
and the command
jq -r '.games[] | select(.files[] | contains("ogv"))' foo.json
jq outputs an element once for every entry in its files array that matches ogv. How do I get jq to output each matching element only once?
Using any would be more efficient than relying on unique. E.g.
jq -r '.games[] | select(any(.files[]; test("ogv")))'
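As a quick check, here is the any-based filter run against the sample data from the question, printing only the ids. This is a sketch with the JSON inlined in a shell variable rather than read from foo.json.

```shell
games='{"games":[
  {"id":1,"files":["foo.mp4"]},
  {"id":2,"files":["foo.ogv","bar.ogv"]},
  {"id":3,"files":["bar.ogv"]}
]}'

# any(generator; condition) yields a single boolean per game,
# so select emits each matching game at most once.
ids=$(printf '%s' "$games" |
  jq -r '.games[] | select(any(.files[]; test("ogv"))) | .id')
echo "$ids"
```

Game 2 appears once even though two of its files match, because select sees one true value instead of one per file.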
jq -r '[.games[] | select(.files[] | contains("ogv"))] | unique | .[]' foo.json
or, since what I really want is just the id,
jq -r '[.games[] | select(.files[] | contains("ogv")) | .id] | unique | .[]' foo.json
I'm trying to use jq to filter my results when the value contains quote literals so my data looks like:
{"key": "site=\"abc\""}
I want to filter using contains (or some other method) for entries where site=abc but not site=abc123.
Current code, which matches both abc and abc123:
jq -c '.textPayload | select(contains("abc"))' test.json
I tried to escape the quotes using \ but it looks like that doesn't work in the contains method?
Consider:
$ echo '{"key": "site=\"abc\""}' | jq 'select(.key | contains("\"abc\""))'
{
"key": "site=\"abc\""
}
$ echo '{"key": "site=\"abc\""}' | jq 'select(.key | index("\"abc\""))'
{
"key": "site=\"abc\""
}
$ echo '{"key": "site=\"abc\""}' | jq 'select(.key | test("\"abc\""))'
{
"key": "site=\"abc\""
}
So it's unclear what the difficulty is.
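To confirm that including the escaped quotes in the search string separates abc from abc123, here is a small sketch with two records on the input. The second record is invented for the comparison.

```shell
# Two records: only the first has exactly site="abc".
matches=$(printf '%s\n' '{"key": "site=\"abc\""}' '{"key": "site=\"abc123\""}' |
  jq -c 'select(.key | contains("\"abc\""))')
echo "$matches"
```

The abc123 record is dropped because its value has no closing quote directly after abc.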
I need help correcting the syntax of a jq test. The command below is meant to test the ID list in the output file that follows, but it gives an error about indexing a string type.
[[ $(echo $output| jq -r '.output.value[] | select(.identity).id_list') == *"id2"* ]]
output = {
"resource_output": {
"value": {
"identity": [
{
"id_list": [
"/subscriptions/---/id1",
"/subscriptions/---/id2",
"/subscriptions/--/id3"
],
"principal_id": "",
"tenant_id": "",
"type": "managed"
}
]
}
}
}
Your query does not match the sample JSON, and you have not indicated what output you are expecting, but the following variation of your query illustrates how to use select and test with your data along the lines suggested by your attempt:
echo "$output" |
jq -r '.resource_output.value.identity[].id_list[] | select(test("id2"))'
Output:
/subscriptions/---/id2
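If the goal is a pass/fail check in a shell test rather than printed output, one option is to let jq's exit status carry the result via `-e`. This is a sketch under the assumption that the JSON has the shape shown in the question (including the value level); the `json` variable stands in for the real output.

```shell
json='{"resource_output":{"value":{"identity":[{"id_list":[
  "/subscriptions/---/id1","/subscriptions/---/id2","/subscriptions/--/id3"],
  "principal_id":"","tenant_id":"","type":"managed"}]}}}'

# jq -e derives the exit status from the last output:
# 0 for true, 1 for false or null.
if printf '%s' "$json" |
   jq -e 'any(.resource_output.value.identity[].id_list[]; test("id2"))' >/dev/null
then
  found=yes
else
  found=no
fi
echo "$found"
```

This avoids comparing jq's text output against a glob pattern in `[[ ... ]]`.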