How to modify each element of an array in jq

Suppose I have a JSON:
[
{
"title": "Title1",
"reference": [
"123"
]
},
{
"title": "Title2",
"reference": [
"234",
"345"
]
}
]
I'd like to modify each element of the reference array so that the reference appears twice. The desired result is:
[
{
"title": "Title1",
"reference": [
"123 is 123"
]
},
{
"title": "Title2",
"reference": [
"234 is 234",
"345 is 345"
]
}
]
I've tried:
jq '.[] | .reference = [("\(.reference[]) is \(.reference[])")]'
but this fails where the array has more than one item:
{
"title": "Title1",
"reference": [
"123 is 123"
]
}
{
"title": "Title2",
"reference": [
"234 is 234",
"345 is 234",
"234 is 345",
"345 is 345"
]
}
How can I modify the above jq to achieve the desired result?
Thanks in advance!

map(.reference |= map(. + " is " + .))
This replaces each element of every .reference array with the string "<element> is <element>":
[
{
"title": "Title1",
"reference": [
"123 is 123"
]
},
{
"title": "Title2",
"reference": [
"234 is 234",
"345 is 345"
]
}
]

This should work just fine:
jq '.[].reference[] |= "\(.) is \(.)"'
It replaces every item of each reference array with a string that contains the item twice, joined by the word "is".
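To see why the attempt in the question produces every combination, it may help to run both filters on a small input. This is a minimal sketch, assuming jq is on the PATH; the one-object input is made up for illustration. Each `.reference[]` inside the interpolated string iterates independently, so two of them yield the Cartesian product, whereas `|=` visits each element exactly once.

```shell
# Made-up input with a two-element reference array.
input='[{"title":"Title2","reference":["234","345"]}]'

# Two independent iterations of .reference[] in one string: 2 x 2 = 4 results.
product=$(printf '%s' "$input" | jq -c '.[] | [ "\(.reference[]) is \(.reference[])" ]')

# Update-assignment rewrites each element in place, once.
updated=$(printf '%s' "$input" | jq -c '.[].reference[] |= "\(.) is \(.)"')

printf '%s\n%s\n' "$product" "$updated"
```

Note that the `|=` form keeps the outer array, while the `.[] | ...` form emits one result per object.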


Merge all objects inside an array that share the same key

I'm trying to deduplicate all objects inside the array results that share the same key id, and merge their path arrays.
JSON input:
[
{
"type": "apple",
"results": [
{
"id": "apple1",
"name": "appleName1",
"path": "/some/path/a"
},
{
"id": "apple1",
"name": "appleName1",
"path": "/some/path/b"
},
{
"id": "apple2",
"name": "appleName2",
"path": "/some/path/c"
}
]
},
{
"type": "orange",
"results": [
{
"id": "orange1",
"name": "orangeName1",
"path": "/some/path/a"
},
{
"id": "orange1",
"name": "orangeName1",
"path": "/some/path/b"
},
{
"id": "orange2",
"name": "orangeName2",
"path": "/some/path/c"
}
]
}
]
Expected output:
[
{
"type": "apple",
"results": [
{
"id": "apple1",
"name": "appleName1",
"path": [
"/some/path/a",
"/some/path/b"
]
},
{
"id": "apple2",
"name": "appleName2",
"path": [
"/some/path/c"
]
}
]
},
{
"type": "orange",
"results": [
{
"id": "orange1",
"name": "orangeName1",
"path": [
"/some/path/a",
"/some/path/b"
]
},
{
"id": "orange2",
"name": "orangeName2",
"path": [
"/some/path/c"
]
}
]
}
]
I've managed to get an approximate solution using:
jq '[{type: .[].type, results: .[].results | group_by(.id) | map({id: .[0].id, name: .[0].name, path: (map(.path))})}]'
But my solution produces two additional elements that aren't supposed to be there.
I know there are some similar questions already answered but I didn't manage to get them to work with this example. Any help is appreciated!
You could group_by the .id field, then for each group take the first item and replace its .path field with a map on the .path fields of all group members:
jq 'map(.results |= (group_by(.id) | map(first + {path: map(.path)})))'
[
{
"type": "apple",
"results": [
{
"id": "apple1",
"name": "appleName1",
"path": [
"/some/path/a",
"/some/path/b"
]
},
{
"id": "apple2",
"name": "appleName2",
"path": [
"/some/path/c"
]
}
]
},
{
"type": "orange",
"results": [
{
"id": "orange1",
"name": "orangeName1",
"path": [
"/some/path/a",
"/some/path/b"
]
},
{
"id": "orange2",
"name": "orangeName2",
"path": [
"/some/path/c"
]
}
]
}
]
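The two steps of that filter can be looked at separately. The following is a rough sketch, assuming jq is on the PATH and using a shortened, made-up results array (ids and paths are placeholders): first the grouping on its own, then the merge applied to each group.

```shell
# Shortened, hypothetical results array.
results='[{"id":"a1","path":"/p/a"},{"id":"a1","path":"/p/b"},{"id":"a2","path":"/p/c"}]'

# Step 1: group_by collects objects sharing the same .id into sub-arrays.
groups=$(printf '%s' "$results" | jq -c 'group_by(.id)')

# Step 2: per group, take the first object and overwrite .path
# with the list of paths from all members of the group.
merged=$(printf '%s' "$results" | jq -c 'group_by(.id) | map(first + {path: map(.path)})')

printf '%s\n%s\n' "$groups" "$merged"
```

Because `group_by` sorts the groups by the grouping key, the merged objects come out ordered by id rather than in input order.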

How to merge 2 JSON files including objects and arrays using jq?

I'm using jq to try and merge 2 json files into one unique file.
The result is close to what I was looking for, but not just right.
File 1:
{
"series": "Harry Potter Movie Series",
"writer": "J.K. Rowling",
"movies": [
{
"title": "Harry Potter and the Philosopher's Stone",
"actors": [
{
"names": [
"Emma Watson",
"Other actor"
],
"other": "Some value"
}
]
},
{
"title": "Harry Potter and the Chamber of Secrets",
"actors": [
{
"names": [
"Emma Watson"
],
"other": "Some value"
}
]
}
]
}
File 2:
{
"series": "Harry Potter Movie Series",
"producer": "David Heyman",
"movies": [
{
"title": "Harry Potter and the Philosopher's Stone",
"year": "2001"
},
{
"title": "Harry Potter and the Chamber of Secrets",
"year": "2002"
}
]
}
Expected result:
{
"series": "Harry Potter Movie Series",
"writer": "J.K. Rowling",
"movies": [
{
"title": "Harry Potter and the Philosopher's Stone",
"year": "2001",
"actors": [
{
"names": [
"Emma Watson",
"Other actor"
],
"other": "Some value"
}
]
},
{
"title": "Harry Potter and the Chamber of Secrets",
"year": "2002",
"actors": [
{
"names": [
"Emma Watson"
],
"other": "Some value"
}
]
}
],
"producer": "David Heyman"
}
Best result I've got so far (only arrays with actors are missing):
{
"series": "Harry Potter Movie Series",
"writer": "J.K. Rowling",
"movies": [
{
"title": "Harry Potter and the Philosopher's Stone",
"year": "2001"
},
{
"title": "Harry Potter and the Chamber of Secrets",
"year": "2002"
}
],
"producer": "David Heyman"
}
Using one of the commands below:
jq -s '.[0] * .[1]' file1 file2
jq --slurp 'add' file1 file2
jq '. * input' file1 file2
If I switch the order of the files, I end up losing either 'actors' from file1 or 'year' from file2.
How it should work:
the elements in file 2 are leading and should replace the matching elements in file 1.
the elements in file 1 that don't exist in file 2 (like writer and movies[].actors) shouldn't be deleted.
the elements in file 2 that don't exist yet in file 1 should be added (like producer and movies[].year).
a title is unique and should not occur more than once; if it does, the duplicates should be removed.
I would assume there is a solution to get these movies arrays perfectly merged with jq.
You are looking for a solution that "merges" objects and arrays. For the former you have already found + (or add) for a top-level merge, and * for a recursive merge, but merging arrays (namely the two .movies fields) needs more specification from your end as there is no canonical solution for that.
In a comment you state
.movies[0] always correspond to the same movie in both files
This enables you to use transpose to align the items from both arrays, and then apply object-merging on each pair of corresponding items. If you want to merge deeper arrays as well (e.g. .movies[].actors or .movies[].actors[].names) you need to extend this approach accordingly. Here's a solution using plain add for the merging of the array items as well as of the other top-level fields:
jq -s 'add + {movies: map(.movies) | transpose | map(add)}' file1 file2
{
"series": "Harry Potter Movie Series",
"writer": "J.K. Rowling",
"movies": [
{
"title": "Harry Potter and the Philosopher's Stone",
"actors": [
{
"names": [
"Emma Watson",
"Other actor"
],
"other": "Some value"
}
],
"year": "2001"
},
{
"title": "Harry Potter and the Chamber of Secrets",
"actors": [
{
"names": [
"Emma Watson"
],
"other": "Some value"
}
],
"year": "2002"
}
],
"producer": "David Heyman"
}
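If the positions in the two movies arrays cannot be relied on, one possible variation is to align the items by title instead of by index. This is only a sketch under that assumption, not the approach described above: it concatenates both arrays, groups by `.title`, and merges each group with `add`, so later files win on conflicts and duplicate titles collapse into one object. The inline documents are made-up stand-ins for file1 and file2, and jq is assumed to be on the PATH.

```shell
# Made-up stand-ins for file1 and file2; the movies are deliberately in different order.
file1='{"writer":"W","movies":[{"title":"B","actors":["x"]},{"title":"A","actors":["y"]}]}'
file2='{"producer":"P","movies":[{"title":"A","year":"2002"},{"title":"B","year":"2001"}]}'

# Slurp both documents, merge the top level with add, then rebuild .movies:
# concatenate the arrays, group by title, and merge every group of objects.
merged=$(printf '%s\n%s\n' "$file1" "$file2" | jq -s -c '
  add + {movies: (map(.movies) | add | group_by(.title) | map(add))}
')

printf '%s\n' "$merged"
```

A side effect of `group_by` is that the resulting movies are sorted by title, so the original order is not preserved.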

Need help parsing json output with jq for a complex json

For the JSON below, I need the id and name of each entry in results that has
authorization.roles[].name == "Supervisor"
What is the jq command to do that? For the JSON below, the expected output is only id 1231 and name AAAA, as that is the only entry with the Supervisor role.
{
"results": [{
"id": "1231",
"name": "AAAA",
"div": {
"id": "AAA",
"name": "DDSAA",
"selfUri": ""
},
"chat": {
"jabberId": "nn"
},
"department": "Shared Services Organization",
"email": "Test#gmail.com",
"primaryContactInfo": [{
"address": "Test#gmail.com",
"mediaType": "EMAIL",
"type": "PRIMARY"
}],
"addresses": [],
"state": "active",
"title": "AAA",
"username": "Test#gmail.com",
"version": 27,
"authorization": {
"roles": [{
"id": "01256689-c5ed-43a5-b370-58522402830d",
"name": "AA"
}, {
"id": "1e65b009-9f8f-4eef-9844-83944002c095",
"name": "BBB"
}, {
"id": "8a19f1ff-40e5-45d2-b758-14550a173323",
"name": "CCC"
}, {
"id": "d02250e2-7071-46bf-885b-43edff2d88a6",
"name": "Supervisor"
}]
}
}, {
"id": "1255",
"name": "BBBB",
"div": {
"id": "AAA",
"name": "DDSAA",
"selfUri": ""
},
"chat": {
"jabberId": "nn"
},
"department": "Shared Services Organization",
"email": "Test#gmail.com",
"primaryContactInfo": [{
"address": "Test#gmail.com",
"mediaType": "EMAIL",
"type": "PRIMARY"
}],
"addresses": [],
"state": "active",
"title": "AAA",
"username": "Test#gmail.com",
"version": 27,
"authorization": {
"roles": [{
"id": "01256689-c5ed-43a5-b370-58522402830d",
"name": "AA"
}, {
"id": "1e65b009-9f8f-4eef-9844-83944002c095",
"name": "BBB"
}, {
"id": "8a19f1ff-40e5-45d2-b758-14550a173323",
"name": "CCC"
}, {
"id": "d02250e2-7071-46bf-885b-43edff2d88a6",
"name": "Tester"
}]
}
}]
}
Don't put commas before closing brackets or curly braces (it's not valid JSON). Your input should look like this:
{
"results": [
{
"id": "1231",
"name": "AAAA",
"div": {
"id": "AAA",
"name": "DDSAA",
"selfUri": ""
},
"chat": {
"jabberId": "nn"
},
"department": "Shared Services Organization",
"email": "Test#gmail.com",
"primaryContactInfo": [
{
"address": "Test#gmail.com",
"mediaType": "EMAIL",
"type": "PRIMARY"
}
],
"addresses": [],
"state": "active",
"title": "AAA",
"username": "Test#gmail.com",
"version": 27,
"authorization": {
"roles": [
{
"id": "01256689-c5ed-43a5-b370-58522402830d",
"name": "AA"
},
{
"id": "1e65b009-9f8f-4eef-9844-83944002c095",
"name": "BBB"
},
{
"id": "8a19f1ff-40e5-45d2-b758-14550a173323",
"name": "CCC"
},
{
"id": "d02250e2-7071-46bf-885b-43edff2d88a6",
"name": "Supervisor"
}
]
}
},
{
"id": "1255",
"name": "BBBB",
"div": {
"id": "AAA",
"name": "DDSAA",
"selfUri": ""
},
"chat": {
"jabberId": "nn"
},
"department": "Shared Services Organization",
"email": "Test#gmail.com",
"primaryContactInfo": [
{
"address": "Test#gmail.com",
"mediaType": "EMAIL",
"type": "PRIMARY"
}
],
"addresses": [],
"state": "active",
"title": "AAA",
"username": "Test#gmail.com",
"version": 27,
"authorization": {
"roles": [
{
"id": "01256689-c5ed-43a5-b370-58522402830d",
"name": "AA"
},
{
"id": "1e65b009-9f8f-4eef-9844-83944002c095",
"name": "BBB"
},
{
"id": "8a19f1ff-40e5-45d2-b758-14550a173323",
"name": "CCC"
},
{
"id": "d02250e2-7071-46bf-885b-43edff2d88a6",
"name": "Tester"
}
]
}
}
]
}
Then, you can use select to narrow down your target objects (here using any to check whether at least one of the role names matches your string; thx @ikegami), then output any part of the resulting object(s):
jq '
.results[]
| select(any(.authorization.roles[]; .name == "Supervisor"))
| {id, name}
'
{
"id": "1231",
"name": "AAAA"
}
If instead of a JSON output you need raw text, use the -r (or --raw-output) flag, and provide the fields you are interested in:
jq -r '
.results[]
| select(any(.authorization.roles[]; .name == "Supervisor"))
| .id, .name
'
1231
AAAA
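If several entries can match, printing id and name on separate lines makes them hard to pair up again. One option, sketched here with a trimmed-down, made-up input (only the fields the filter touches), is to collect both fields into an array and format it with `@tsv`, which gives one tab-separated line per match. jq is assumed to be on the PATH.

```shell
# Trimmed, hypothetical input: only the fields used by the filter.
input='{"results":[
  {"id":"1231","name":"AAAA","authorization":{"roles":[{"name":"AA"},{"name":"Supervisor"}]}},
  {"id":"1255","name":"BBBB","authorization":{"roles":[{"name":"AA"},{"name":"Tester"}]}}
]}'

# -r prints the @tsv string raw: one line per matching entry, fields separated by a tab.
line=$(printf '%s' "$input" | jq -r '
  .results[]
  | select(any(.authorization.roles[]; .name == "Supervisor"))
  | [.id, .name]
  | @tsv
')

printf '%s\n' "$line"
```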

How to convert JSON data to tidy format in R

I have never worked with JSON data in R and, unfortunately, I was sent a sample of data like this:
{
"task_id": "104",
"status": "succeeded",
"metrics": {
"requests_made": 2,
"network_errors": 0,
"unique_locations_visited": 0,
"requests_queued": 0,
"queue_items_completed": 2,
"queue_items_waiting": 0,
"issue_events": 9,
"caption": "",
"progress": 100
},
"message": "",
"issue_events": [
{
"id": "1234",
"type": "issue_found",
"issue": {
"name": "policy not enforced",
"type_index": 123456789,
"serial_number": "123456789183923712",
"origin": "https://test.com",
"path": "/robots.txt",
"severity": "low",
"confidence": "certain",
"caption": "/robots.txt",
"evidence": [
{
"type": "FirstOrderEvidence",
"detail": {
"band_flags": [
"in_band"
]
},
"request_response": {
"url": "https://test.com/robots.txt",
"request": [
{
"type": "DataSegment",
"data": "jaghsdjgasdgaskjdgasdgashdgsahdgasjkdgh==",
"length": 313
}
],
"response": [
{
"type": "DataSegment",
"data": "asudasjdgasaaasgdasgaksjdhgasjdgkjghKGKGgKJgKJgKJGKgh==",
"length": 303
}
],
"was_redirect_followed": false,
"request_time": "1234567890"
}
}
],
"internal_data": "jdfhgjhJHkjhdskfhkjhjs0sajkdfhKHKhkj=="
}
},
{
"id": "1235",
"type": "issue_found",
"issue": {
"name": "certificate",
"type_index": 12345845684,
"serial_number": "123456789165637150",
"origin": "https://test.com",
"path": "/",
"severity": "info",
"confidence": "certain",
"description": "The server description a valid, trusted certificate. This issue is purely informational.<br><br>The server presented the following certificates:<br><br><h4>Server certificate</h4><table><tr><td><b>Issued to:</b> </td><td>test.ie, test.com, www.test.com, www.test.ie</td></tr><tr><td><b>Issued by:</b> </td><td>GeoTrust EV RSA CA 2018</td></tr><tr><td><b>Valid from:</b> </td><td>Tue May 12 00:00:00 UTC 2020</td></tr><tr><td><b>Valid to:</b> </td><td>Tue May 17 12:00:00 UTC 2022</td></tr></table><h4>Certificate chain #1</h4><table><tr><td><b>Issued to:</b> </td><td>GeoTrust EV RSA CA 2018</td></tr><tr><td><b>Issued by:</b> </td><td> High Assurance EV Root CA</td></tr><tr><td><b>Valid from:</b> </td><td>Mon Nov 06 12:22:46 UTC 2017</td></tr><tr><td><b>Valid to:</b> </td><td>Sat Nov 06 12:22:46 UTC 2027</td></tr></table><h4>Certificate chain #2</h4><table><tr><td><b>Issued to:</b> </td><td> High Assurance EV Root CA</td></tr><tr><td><b>Issued by:</b> </td><td> High Assurance EV Root CA</td></tr><tr><td><b>Valid from:</b> </td><td>Fri Nov 10 00:00:00 UTC 2006</td></tr><tr><td><b>Valid to:</b> </td><td>Mon Nov 10 00:00:00 UTC 2031</td></tr></table>",
"caption": "/",
"evidence": [],
"internal_data": "sjhdgsajdggJGJHgjfgjhGJHgjhsdgfgjhGJHGjhsdgfjhsgfdsjfg098867hjhgJHGJHG=="
}
},
{
"id": "1236",
"type": "issue_found",
"issue": {
"name": "without flag set",
"type_index": 1254392,
"serial_number": "12345678965616",
"origin": "https://test.com",
"path": "/robots.txt",
"severity": "info",
"confidence": "certain",
"description": "my description text here....",
"caption": "/robots.txt",
"evidence": [
{
"type": "InformationListEvidence",
"request_response": {
"url": "https://test.com/robots.txt",
"request": [
{
"type": "DataSegment",
"data": "adjkhajksdhaskjdhkjHKJHjkhaskjdhkjasdhKHKJHkjsdhfkjsdhfkjsdhKHJKHjksdfhsdjkfhksdjhKHKJHJKhsdkfjhsdkfjhKHJKHjksdkfjhsdkjfhKHKJHjkhsdkfjhsdkjfhsdjkfhksdjfhKJHKjksdhfsdjkfhksdjfhsdkjhKHJKhsdkfhsdkjfhsdkfhdskjhKHKjhsdfkjhsdjkfh==",
"length": 313
}
],
"response": [
{
"type": "DataSegment",
"data": "adjkhajksdhaskjdhkjHKJHjkhaskjdhkjasdhKHKJHkjsdhfkjsdhfkjsdhKHJKHjksdfhsdjkfhksdjhKHKJHJKhsdkfjhsdkfjhKHJKHjksdkfjhsdkjfhKHKJHjkhsdkfjhsdkjfhsdjkfhksdjfhKJHKjksdhfsdjkfhksdjfhsdkjhKHJKhsdkfhsdkjfhsdkfhdskjhKHKjhsdfkjhsdjkfh=",
"length": 161
},
{
"type": "HighlightSegment",
"data": "adjkhajksdhaskjdhkjHKJHjkhaskjdhkjasdhKHKJHkjsdhfkjsdhfkjsdhKHJKHjksdfhsdjkfhksdjhKHKJHJKhsdkfjhsdkfjhKHJKHjksdkfjhsdkjfhKHKJHjkhsdkfjhsdkjfhsdjkfhksdjfhKJHKjksdhfsdjkfhksdjfhsdkjhKHJKhsdkfhsdkjfhsdkfhdskjhKHKjhsdf=",
"length": 119
},
{
"type": "DataSegment",
"data": "AasjkdhasjkhkjHKJSDHFJKSDFHKhjkHSKADJFHKhjkhjkh=",
"length": 23
}
],
"was_redirect_followed": false,
"request_time": "178454751191465"
},
"information_items": [
"Other: user_id"
]
}
],
"internal_data": "adjkhajksdhaskjdhkjHKJHjkhaskjdhkjasdhKHKJHkjsdhfkjsdhfkjsdhKHJKHjksdfhsdjkfhksdjhKHKJHJKhsdkfjhsdkfjhKHJKHjksdkfjhsdkjfhKHKJHjkhsdkfjhsdkjfhsdjkfhksdjfhKJHKjksdhfsdjkfhksdjfhsdkjhKHJKhsdkfhsdkjfhsdkfhdskjhKH=="
}
},
{
"id": "1237",
"type": "issue_found",
"issue": {
"name": "without flag set",
"type_index": 1234567,
"serial_number": "123456789056704",
"origin": "https://test.com",
"path": "/",
"severity": "info",
"confidence": "certain",
"description": "long description here zjkhasdjkh hsajkdhsajkd hasjkdhbsjkdash d",
"caption": "/",
"evidence": [
{
"type": "InformationListEvidence",
"request_response": {
"url": "https://test.com/",
"request": [
{
"type": "DataSegment",
"data": "adjkhajksdhaskjdhkjHKJHjkhaskjdhkjasdhKHKJHkjsdhfkjsdhfkjsdhKHJKHjksdfhsdjkfhksdjhKHKJHJKhsdkfjhsdkfjhKHJKHjksdkfjhsdkjfhKHKJHjkhsdkfjhsdkjfhsdjkfhksdjfhKJHKjksdhfsdjkfhksdjfhsdkjhKHJKhsdkfhsdkjfhsdkfhdskjhKHKjhsdfkjhsdjkfhsfdsfdsfdsfdsfdsfsdfdsf",
"length": 303
}
],
"response": [
{
"type": "DataSegment",
"data": "adjkhajksdhaskjdhkjHKJHjkhaskjdhkjasdhKHKJHkjsdhfkjsdhfkjsdhKHJKHjksdfhsdjkfhksdjhKHKJHJKhsdkfjhsdkfjhKHJKHjksdkfjhsdkjfhKHKJHjkhsdkfjhsdkjfhsdjkfhksdjfhKJHKjksdhfsdjkfhksdjfhsdkjhKHJKhsdkfhsdkjfhsdkfhdskjhKHKjhsdfkjhsdjkfh==",
"length": 151
},
{
"type": "HighlightSegment",
"data": "adjkhajksdhaskjdhkjHKJHjkhaskjdhkjasdhKHKJHkjsdhfkjsdhfkjsdhKHJKHjksdfhsdjkfhksdjhKHKJHJKhsdkfjhsdkfjhKHJKHjksdkfjhsdkjfhKHKJHjkhsdkfjhsdkjfhsdjkfhksdjfhKJHKjksdhfsdjkfhksdjfhsdkjhKHJKhsdkfhsdkjfhsdkfhdskjhKHKjhsdfkjhsdjkfh=",
"length": 119
},
{
"type": "DataSegment",
"data": "sdfdsfsdfSDFSDFdSFDS546SDFSDFDSFG657=",
"length": 23
}
],
"was_redirect_followed": false,
"request_time": "123541191466"
},
"information_items": [
"Other: user_id"
]
}
],
"internal_data": "adjkhajksdhaskjdhkjHKJHjkhaskjdhkjasdhKHKJHkjsdhfkjsdhfkjsdhKHJKHjksdfhsdjkfhksdjhKHKJHJKhsdkfjhsdkfjhKHJKHjksdkfjhsdkjfhKHKJHjkhsdkfjhsdkjfhsdjkfhksdjfhKJHKjksdhfsdjkfhksdjfhsd=="
}
},
{
"id": "1238",
"type": "issue_found",
"issue": {
"name": "parameter pollution",
"type_index": 4137000,
"serial_number": "123456789810290176",
"origin": "https://test.com",
"path": "/robots.txt",
"severity": "low",
"confidence": "firm",
"description": "very long description text here...",
"caption": "/robots.txt [URL path filename]",
"evidence": [
{
"type": "FirstOrderEvidence",
"detail": {
"payload": {
"bytes": "Q3jkeiZkcmg8MQ==",
"flags": 0
},
"band_flags": [
"in_band"
]
},
"request_response": {
"url": "https://test.com/%3fhdz%26drh%3d1",
"request": [
{
"type": "DataSegment",
"data": "W1QOIC8=",
"length": 5
},
{
"type": "HighlightSegment",
"data": "WRMnBGR6JTI2ZHJoJTNkMQ==",
"length": 16
},
{
"type": "DataSegment",
"data": "adjkhajksdhaskjdhkjHKJHjkhaskjdhkjasdhKHKJHkjsdhfkjsdhfkjsdhKHJKHjksdfhsdjkfhksdjhKHKJHJKhsdkfjhsdkfjhKHJKHjksdkfjhsdkjfhKHKJHjkhsdkfjhsdkjfhsdjkfhksdjfhKJHKjksdhfsdjkfhksdjfhsdkjhKHJKhsdkfhsdkjfhsdkfhdskjhKHKjhsdfkjhsdjkfhcvxxcvklxcvjkxclvjxclkvjxcklvjlxckjvlxckjvklxcjvxcklvjxcklvjxckljvlxckjvxcklvjxckljvxcklvjcklxjvcxkl==",
"length": 298
}
],
"response": [
{
"type": "DataSegment",
"data": "adjkhajksdhaskjdhkjHKJHjkhaskjdhkjasdhKHKJHkjsdhfkjsdhfkjsdhKHJKHjksdfhsdjkfhksdjhKHKJHJKhsdkfjhsdkfjhKHJKHjksdkfjhsdkjfhKHKJHjkhsdkfjhsdkjfhsdjkfhksdjfhKJHKjksdhfsdjkfhksdjfhsdkjhKHJKhsdkfhsdkjfhsdkfhdskjhKHKjhsdfkjhsdjkfh==",
"length": 130
},
{
"type": "HighlightSegment",
"data": "Q4jleiZkcmg9MQ==",
"length": 10
},
{
"type": "DataSegment",
"data": "adjkhajksdhaskjdhkjHKJHjkhaskjdhkjasdhKHKJHkjsdhfkjsdhfkjsdhKHJKHjksdfhsdjkfhksdjhKHKJHJKhsdkfjhsdkfjhKHJKHjksdkfjhsdkjfhKHKJHjkhsdkfjhsdkjfhsdjkfhksdjfhKJHKjksdhfsdjkfhksdjfhsdkjhKHJKhsdkfhsdkjfhsdkfhdskjhKHKjhsdfkjhsdjkfh==",
"length": 163
}
],
"was_redirect_followed": false,
"request_time": "51"
}
}
],
"internal_data": "adjkhajksdhaskjdhkjHKJHjkhaskjdhkjasdhKHKJHkjsdhfkjsdhfkjsdhKHJKHjksdfhsdjkfhksdjhKHKJHJKhsdkfjhsdkfjhKHJKHjksdkfjhsdkjfhKHKJHjkhsdkfjhsdkjfhsdjkfhksdjfhKJHKjksdhfsdjkfhksdjfhsdkjhKHJKhsdkfhsdkjfhsdkfhdskjhKHKjhsdfkjhsdjkfh="
}
}
],
"event_logs": [],
"audit_items": []
}
I read it in R using jsonlite:
df_orig <- fromJSON('dast_sample_output.json', flatten= T)
This gives a nested list R object. I wish to convert this list to a data frame in a tidy format, with all the arrays and sub-arrays unnested.
If you run str(df_orig), you can see the nested data frames in it.
How do I convert it to a tidy format?
I tried unnest() and purrr, but I am struggling to get the data into a tidy format for analysis. Any pointers would be highly appreciated.
Cheers,
Use the fromJSON() function from the jsonlite package.
Edit:
Set the option flatten = TRUE.
Edit 2:
If the data comes from an HTTP response, extract it with content(x, 'text') before flattening.
Here is a full example converting to a data.table:
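library(httr)        # GET(), content()
library(jsonlite)    # fromJSON()
library(data.table)  # as.data.table()
# apicall.text: the API request URL, defined elsewhere
get.json <- GET(apicall.text)
get.json.text <- content(get.json, 'text')
get.json.flat <- fromJSON(get.json.text, flatten = TRUE)
dt <- as.data.table(get.json.flat)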

Using jq, how can I limit values based on a key

For an input file that looks like this:
{
"employees": [
{
"number": "101",
"tags": [
{
"value": "yes",
"key": "management"
},
{
"value": "joe",
"key": "login"
},
{
"value": "joe blogs",
"key": "name"
}
]
},
{
"number": "102",
"tags": [
{
"value": "no",
"key": "management"
},
{
"value": "jane",
"key": "login"
},
{
"value": "jane doe",
"key": "name"
}
]
},
{
"number": "103",
"tags": [
{
"value": "no",
"key": "management"
},
{
"value": "john",
"key": "login"
},
{
"value": "john doe",
"key": "name"
}
]
}
]
}
... I'd like to get details for all non-management employees so that the desired output looks like this:
{
"number": "102",
"name": "jane doe",
"login": "jane"
}
{
"number": "103",
"name": "john doe",
"login": "john"
}
I can't figure out how to limit results based on a key without selecting that key (in this case "management")
The following is a slightly more succinct solution:
.employees[]
| .tags |= from_entries
| select(.tags.management == "no")
| {number, "name": .tags.name, "login": .tags.login}
Using from_entries, this worked for me:
$ jq '.employees[] | {number: .number, tags: .tags | from_entries} | select(.tags.management=="no") | {number: .number, name: .tags.name, login: .tags.login}' input
... and the output is:
{
"number": "102",
"name": "jane doe",
"login": "jane"
}
{
"number": "103",
"name": "john doe",
"login": "john"
}
There may be a better way to achieve what I wanted, so I'll leave the question open for a while if someone wants to offer a better solution.
Here is another solution that uses from_entries:
.employees[]
| {number} + (.tags | from_entries)
| if .management == "no" then {number, name, login} else empty end
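All three answers rely on `from_entries`, so it may be worth seeing what it does in isolation. The sketch below assumes jq is on the PATH and uses a shortened, made-up two-employee input. It first converts one tags array into an object, then filters on management and drops that key with `del` instead of listing the remaining fields by hand, which is a variation rather than any of the filters shown above.

```shell
# Hypothetical two-employee sample in the same shape as the input above.
input='{"employees":[
  {"number":"101","tags":[{"value":"yes","key":"management"},{"value":"joe","key":"login"}]},
  {"number":"102","tags":[{"value":"no","key":"management"},{"value":"jane","key":"login"}]}
]}'

# from_entries turns [{"key": k, "value": v}, ...] into {k: v, ...}.
tags=$(printf '%s' "$input" | jq -c '.employees[0].tags | from_entries')

# After the conversion, filtering is a plain field lookup;
# del(.management) removes the key that was only needed for the test.
staff=$(printf '%s' "$input" | jq -c '
  .employees[]
  | {number} + (.tags | from_entries)
  | select(.management == "no")
  | del(.management)
')

printf '%s\n%s\n' "$tags" "$staff"
```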
