Using `sed` to remove all elements from JSON array - unix

I know StackOverflow is not a code-writing-service, but sed has been driving me crazy for the past 3 hours.
In summary, I need to modify the contents of a .json file that I have.
What the file looks like:
{
// ...
"first": {
"second": [
"indexZero",
"theseStringsAreDynamic",
"soINeedToUseWildcard"
]
}
// ...
}
Desired result:
{
// ...
"first": {
"second": [
]
}
// ...
}
What have you done?
I have tried about a million variations loosely based upon:
sed -i 's/\"second\": \[.*\]/\"second\": []/' "my.json"
## ~ Which gives: ~
#
# "first": {
# "second": []
# "indexZero",
# "theseStringsAreDynamic",
# "soINeedToUseWildcard"
# ]
# },
Essentially, I need to remove all elements from an array in a .json file, so if sed is not the correct tool for the job, please let me know.
Thank you for your time.

The correct tool for the job is jq:
$ jq '.first.second = []' input.json
{
"first": {
"second": []
}
}
To replace the original file, it's a two step process - redirect output to a temporary file and then rename it:
jq '.first.second = []' orig.json > tmp.json && mv -f tmp.json orig.json
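The two-step replace above can be wrapped in a small function. This is only a sketch: `jq_inplace` is a hypothetical helper name, not part of jq, and it assumes `mktemp` is available and accepts a template argument. It creates the temporary file next to the target and only moves it into place if jq succeeds.

```shell
# Hypothetical helper (not part of jq): rewrite a JSON file in place.
jq_inplace() {
  filter=$1 file=$2
  # Create the temporary file next to the target so mv stays on one filesystem.
  tmp=$(mktemp "${file}.XXXXXX") || return 1
  if jq "$filter" "$file" > "$tmp"; then
    mv -f "$tmp" "$file"
  else
    rm -f "$tmp"   # jq failed: leave the original file untouched
    return 1
  fi
}

# Usage on a throwaway copy of the sample data
workdir=$(mktemp -d)
printf '%s\n' '{"first":{"second":["indexZero","theseStringsAreDynamic"]}}' > "$workdir/my.json"
jq_inplace '.first.second = []' "$workdir/my.json"
jq -c . "$workdir/my.json"
```

Checking jq's exit status before the move means a typo in the filter cannot replace the original with an empty file.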

Related

How to put specific filenames into a specific JSON format using bash or Perl?

Assuming I'm in the folder like this:
➜ tmp.lDrLPUOF ls
1.txt 2.txt 3.txt 1.zip 2.rb
I want to put all the filenames of text files into a specific JSON format like this:
{
"": [
{
"title": "",
"file": "1"
},
{
"title": "",
"file": "2"
},
{
"title": "",
"file": "3"
}
]
}
Now I just know how to list all the filenames:
➜ tmp.lDrLPUOF ls *'.txt'
1.txt 2.txt 3.txt
Can I use bash or Perl to achieve this purpose? Thank you very much!
Edit
Thanks to @Charles Duffy and @Shawn for the great answers. But I forgot to mention another important piece of information: time. I want the filenames to appear in the resulting JSON in order of their creation time.
The creation times are as follows:
➜ tmp.lDrLPUOF ls -lTr
total 0
-rw-r--r-- 1 administrator staff 0 Oct 12 09:35:05 2022 3.txt
-rw-r--r-- 1 administrator staff 0 Oct 12 09:35:08 2022 2.txt
-rw-r--r-- 1 administrator staff 0 Oct 12 09:35:12 2022 1.txt
So the resulting JSON I wanted should be like this:
{
"": [
{
"title": "",
"file": "3"
},
{
"title": "",
"file": "2"
},
{
"title": "",
"file": "1"
}
]
}
{ shopt -s nullglob; set -- *.txt; printf '%s\0' "$@"; } | jq -Rn '
{"": [ input
| split("\u0000")[]
| select(. != "")
| {"title": "",
"file": . | rtrimstr(".txt")
}
]
}
'
Let's break this down into pieces.
On the bash side:
shopt -s nullglob tells the shell that if *.txt has no matches, it should emit nothing at all, instead of emitting the string *.txt as a result.
set -- overwrites the argument list in the current context (because this is a block on the left-hand side of a pipeline, that context is transient and won't change "$@" in code outside the pipe).
printf '%s\0' "$@" prints our arguments, with a NUL character after each one; if there are no arguments at all, it prints only a NUL.
On the jq side:
-R specifies that the input is raw data, not json.
-n specifies that we don't automatically consume any inputs, but will instead use input or inputs to specify where input should be read.
split("\u0000") splits the input on NULs. (This is important because the NUL is the only character that can't exist in a filename, which is why we used printf '%s\0' on the shell end; that way we work correctly with filenames with newlines, literal quotes, whitespace, and all the other weirdness that's able to exist).
select(. != "") ignores empty strings.
rtrimstr(".txt") removes .txt from the name.
Addendum: Sorting by mtime
The jq parts don't need to be modified here: to sort by mtime you can adjust only the shell. On a system with GNU find, sort and sed, this might look like:
find . -maxdepth 1 -type f -name '*.txt' -printf '%T@ %P\0' |
sort -zn |
sed -z -re 's/^[[:digit:].]+ //g' |
jq -Rn '
...followed by the same jq given above.
If installed, tree can be a good alternative to list the contents of directories as it can encode its output as well-defined JSON which comes in handy when dealing with strange file names (and especially when your desired output is JSON anyways).
tree -JtL 1 -P '*.txt'
[
{"type":"directory","name":".","contents":[
{"type":"file","name":"3.txt"},
{"type":"file","name":"2.txt"},
{"type":"file","name":"1.txt"}
]}
,
{"type":"report","directories":0,"files":3}
]
tree -J outputs JSON
tree -t sorts by last modification time
tree -L 1 recurses only 1 level deep
tree -P '*.txt' restricts the list to files matching the pattern *.txt
Of course, you can also add more details, if needed, such as
tree -p includes file permissions
tree -u and tree -g include user and group names
tree -s includes the file size in bytes
tree -D --timefmt '%F %T' includes the last modification time
tree -JtL 1 -P '*.txt' -pusD --timefmt='%F %T'
[
{"type":"directory","name":".","mode":"0755","prot":"drwxr-xr-x","user":"hustnzj","size":4096,"time":"2022-10-12 09:35:00","contents":[
{"type":"file","name":"3.txt","mode":"0644","prot":"-rw-r--r--","user":"hustnzj","size":123,"time":"2022-10-12 09:35:05"},
{"type":"file","name":"2.txt","mode":"0644","prot":"-rw-r--r--","user":"hustnzj","size":456,"time":"2022-10-12 09:35:08"},
{"type":"file","name":"1.txt","mode":"0644","prot":"-rw-r--r--","user":"hustnzj","size":789,"time":"2022-10-12 09:35:12"}
]}
,
{"type":"report","directories":0,"files":3}
]
A note regarding this comment: tree -t sorts by last modification time. There's also an option tree -c to sort by (and with tree -D to show time as) last status change instead, but there's no dedicated option (I know of) that uses creation/birth times (if supported by the file system).
Then, using that JSON output as input, you can employ jq for further filtering and formatting:
tree … | jq --arg ext '.txt' '
{"": (first.contents | map(
select(.type == "file") | {title: "", file: .name | rtrimstr($ext)}
))}
'
{
"": [
{
"title": "",
"file": "3"
},
{
"title": "",
"file": "2"
},
{
"title": "",
"file": "1"
}
]
}
Note: This includes the filter select(.type == "file") as tree would also include the names of subdirectories. Drop it if you want them included.
Using just jq, any shell:
$ jq -n --args '{"": [ $ARGS.positional[] | rtrimstr(".txt") | { title: "", file: . } ] }' *.txt
{
"": [
{
"title": "",
"file": "1"
},
{
"title": "",
"file": "2"
},
{
"title": "",
"file": "3"
}
]
}
The filenames passed on the command line (the expansion of *.txt) are in the jq variable $ARGS.positional. For each one, remove the .txt extension and use the rest in an object of the desired structure.
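To see what `$ARGS.positional` does with awkward names, here is a small sketch that passes literal strings instead of a glob. The two file names are made up for illustration. No files need to exist, because `--args` hands the strings to jq without opening them.

```shell
# The file names are hypothetical; --args passes them through as plain strings.
result=$(jq -n -c --args \
  '{"": [ $ARGS.positional[] | rtrimstr(".txt") | {title: "", file: .} ]}' \
  'my notes.txt' 'say "hi".txt')
printf '%s\n' "$result"
# → {"":[{"title":"","file":"my notes"},{"title":"","file":"say \"hi\""}]}
```

Spaces and quotes arrive intact because the shell passes each name as a separate argument and jq does the JSON escaping.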
Or with a perl one-liner:
$ perl -MJSON::PP -E 'say encode_json({"" => [ map { { title => "", file => s/\.txt$//r } } @ARGV ] })' *.txt
{"":[{"file":"1","title":""},{"title":"","file":"2"},{"file":"3","title":""}]}
My take:
stat -c '%Y:%n' *.txt \
| sort -t: -n \
| cut -d: -f2- \
| xargs basename -s .txt \
| jq -s 'map({title: "", file: tostring}) | {"": .}'

JQ - how to convert CSV with headers to JSON array with nested dictionaries

I got the following CSV sample
key1,key2,key3,key4,key5
val1,val2,val3,val4,val5
Looking for tips how to convert the above structure into the following JSON structure
[
{
"event": "bleep",
"sourcetype": "rats",
"fields": {
"key1":"val1",
"key2":"val2",
"key3":"val3",
"key4":"val4",
"key5":"val5"
}
},
{
"event": "bleep",
"sourcetype": "rats",
"fields": {
"key1":"val1",
"key2":"val2",
"key3":"val3",
"key4":"val4",
"key5":"val5"
}
}
]
Thanks in advance!
Use -R to read the input as raw text
jq -R '
(. / ",") as $keys
| [ inputs / "," | [$keys, .]
| reduce transpose[] as $i (
{event: "bleep", sourcetype: "rats"}; .fields[$i[0]] = $i[1])
]
' input.csv
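A self-contained way to try the filter above is to feed it an inline CSV instead of input.csv. The sample values below are invented for illustration. Note that splitting on "," is a naive approach: it assumes no field contains a quoted comma.

```shell
# Inline sample data (hypothetical values) standing in for input.csv
csv='key1,key2,key3
val1,val2,val3
val4,val5,val6'

json=$(printf '%s\n' "$csv" | jq -R -c '
  (. / ",") as $keys                 # first line: header names
  | [ inputs / ","                   # each remaining line, split into values
      | [$keys, .]
      | reduce transpose[] as $i (   # transpose pairs each key with its value
          {event: "bleep", sourcetype: "rats"};
          .fields[$i[0]] = $i[1])
    ]
')
printf '%s\n' "$json"
```

The first line is consumed as `.` and every later line through `inputs`, so the program runs once and emits one array with an object per data row.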
The AlaSQL library can load the data file from the server, parse it, and put the result into an array of JSON objects.
Here is an example:
<script src="alasql.min.js"></script>
<script>
alasql('SELECT * FROM CSV("FileName.csv",{headers:true})',[],function(res){
var dataResult = {items:res};
});
</script>

jq: Use context object as key in query from root

I have a JSON object where the relevant parts are of the form
{
"_meta": {
"hostvars": {
"name_1": {
"ansible_host": "10.0.0.1"
},
"name_2": {
"ansible_host": "10.0.0.2"
},
"name_3": {
"ansible_host": "10.0.0.3"
}
}
},
...
"nodes": {
"hosts": [
"name_1",
"name_2"
]
}
}
(the output of ansible-inventory --list, for reference).
I would like to use jq to produce a list of IPs of the nodes hosts by looking up the names in ._meta.hostvars. In the example, the output should be:
10.0.0.1
10.0.0.2
Note that 10.0.0.3 should not be included because name_3 is not in the .nodes.hosts list. So just doing jq -r '._meta.hostvars[].ansible_host' doesn't work.
I've tried jq '.nodes.hosts[] | ._meta.hostvars[.].ansible_host' but that fails because ._meta doesn't scan from the root after the pipe.
You can store the root in a variable before changing the context:
jq -r '. as $root | .nodes.hosts[] | $root._meta.hostvars[.].ansible_host'
But a better solution is to just inline the "hosts" query:
jq -r '._meta.hostvars[.nodes.hosts[]].ansible_host'
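Both forms can be checked against a trimmed copy of the question's sample. This is a sketch with the elided parts of the inventory left out. The variable names `inventory`, `with_var` and `inlined` are only for illustration.

```shell
inventory='{
  "_meta": {"hostvars": {
    "name_1": {"ansible_host": "10.0.0.1"},
    "name_2": {"ansible_host": "10.0.0.2"},
    "name_3": {"ansible_host": "10.0.0.3"}}},
  "nodes": {"hosts": ["name_1", "name_2"]}
}'

# Form 1: keep the root in a variable before the context changes.
with_var=$(printf '%s' "$inventory" |
  jq -r '. as $root | .nodes.hosts[] | $root._meta.hostvars[.].ansible_host')

# Form 2: the index expression inside [...] is evaluated against the root.
inlined=$(printf '%s' "$inventory" |
  jq -r '._meta.hostvars[.nodes.hosts[]].ansible_host')

printf '%s\n' "$with_var"
[ "$with_var" = "$inlined" ] && echo "both forms agree"
```

The inlined form works because the expression inside the brackets still sees the original input, not `._meta.hostvars`.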

How to remove Cookie value from HAR file before sharing it with another person

It's common to diagnose a web app with a HAR file, especially when seeking technical support from another person. However, a HAR file contains sensitive information such as request cookies. For example, following this guide would share cookie values with technical support if the HAR file is not cleaned up first.
So I prefer deleting request cookies before sharing the HAR file.
Is there any simple command to do it?
To indiscriminately delete all request.cookies:
jq 'del(.. | .request?.cookies?)' www.ibm.com.har
As an example, given this JSON file:
$ jq -M . /tmp/hartest.json
{
"one": {
"foo": {
"bar": true,
"baz": 23
}
},
"two": {
"foo": {
"bar": "Hello",
"baz": "World"
}
},
"three": {
"fu": {
"bar": "gone",
"baz": "forgotten"
}
}
}
I can get rid of all 'bar' keys with this command:
jq -M 'del( .. | .bar?)' /tmp/hartest.json
Running the before and after through diff:
$ diff <(jq -M . /tmp/hartest.json) <(jq -M 'del( .. | .bar?)' /tmp/hartest.json )
4d3
< "bar": true,
10d8
< "bar": "Hello",
16d13
< "bar": "gone",
If I just wanted to get rid of '.foo.bar' only, I would use this:
jq -M 'del( .. | .foo?.bar?)' /tmp/hartest.json
which, again, running through diff gives us:
$ diff <(jq -M . /tmp/hartest.json) <(jq -M 'del( .. | .foo?.bar?)' /tmp/hartest.json )
4d3
< "bar": true,
10d8
< "bar": "Hello",
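The same `del` applied to a minimal HAR-shaped document might look like the sketch below. The sample data is invented. The second step is an assumption that goes beyond the answer above: HAR files usually also carry the cookie string as a `Cookie` entry in `request.headers`, so that entry is filtered out as well.

```shell
# Minimal, made-up HAR-like document for illustration
har='{"log":{"entries":[{"request":{"url":"https://example.test/","headers":[{"name":"Cookie","value":"sid=secret"},{"name":"Accept","value":"*/*"}],"cookies":[{"name":"sid","value":"secret"}]}}]}}'

cleaned=$(printf '%s' "$har" | jq -c '
  del(.. | .request?.cookies?)        # from the answer: drop request.cookies
  | .log.entries[].request.headers    # assumption: also drop the Cookie header
      |= map(select(.name | ascii_downcase != "cookie"))
')
printf '%s\n' "$cleaned"
# → {"log":{"entries":[{"request":{"url":"https://example.test/","headers":[{"name":"Accept","value":"*/*"}]}}]}}
```

Response cookies and `Set-Cookie` headers are not touched by this sketch. Whether they need removing too depends on what the HAR file contains.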

jq syntax help for querying lists output

I need help correcting the syntax of a jq test. The output is shown below, and I am trying to test the ID list with the following command. It fails with an error about indexing a string type.
[[ $(echo $output| jq -r '.output.value[] | select(.identity).id_list') == *"id2"* ]]
output = {
"resource_output": {
"value": {
"identity": [
{
"id_list": [
"/subscriptions/---/id1",
"/subscriptions/---/id2",
"/subscriptions/--/id3"
],
"principal_id": "",
"tenant_id": "",
"type": "managed"
}
]
}
}
}
Your query does not match the sample JSON, and you have not indicated what output you are expecting, but the following variation of your query illustrates how to use select and test with your data along the lines suggested by your attempt:
echo "$output" |
jq -r '.resource_output.value.identity[].id_list[] | select(test("id2"))'
Output:
/subscriptions/---/id2
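For a shell test, one alternative to comparing jq's text output with a glob is to let jq decide and use its exit status. This is a sketch that assumes the nesting shown in the sample (`resource_output.value.identity`) and that a match means an ID ending in `/id2`. The variable `found` is only for illustration.

```shell
output='{"resource_output":{"value":{"identity":[{"id_list":["/subscriptions/---/id1","/subscriptions/---/id2","/subscriptions/--/id3"],"principal_id":"","tenant_id":"","type":"managed"}]}}}'

# jq -e exits 0 when the last result is neither false nor null.
if printf '%s' "$output" |
   jq -e 'any(.resource_output.value.identity[].id_list[]; endswith("/id2"))' >/dev/null
then
  found=yes
else
  found=no
fi
echo "id2 present: $found"
# → id2 present: yes
```

Quoting "$output" matters here: an unquoted `echo $output` lets the shell split and glob the JSON before jq sees it.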
