I have a number of JSON files that look like this:
{
  "property": "value1",
  ...
}
What I want is an output file that looks like this:
{
  "<filename1>": "<value1>",
  "<filename2>": "<value2>",
  "<filename3>": "<value3>",
  ...
}
This can be achieved with two jq invocations and a shell pipe:
jq '{(input_filename):.property}' * | jq -s add
However, I was wondering whether this is possible with a single jq invocation (or any other simpler way).
I'm currently using jq version 1.5-1 in case it's relevant.
Use inputs in combination with the -n option to sequentially access all input files.
In direct analogy to your approach, you could use [inputs] to create the array that the -s option would have produced, and then add up the items as before:
jq -n '[inputs | {(input_filename): .property}] | add' *
But in a more straightforward way, you could employ reduce to iteratively build up your result object:
jq -n 'reduce inputs as $in ({}; .[input_filename] = $in.property)' *
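For illustration, suppose two hypothetical files a.json and b.json (the names here are assumed), each holding one such object. The reduce invocation should then yield:
$ echo '{"property": "value1"}' > a.json
$ echo '{"property": "value2"}' > b.json
$ jq -n 'reduce inputs as $in ({}; .[input_filename] = $in.property)' a.json b.json
{
  "a.json": "value1",
  "b.json": "value2"
}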
I'm running this query with zsh:
output=$(aws sagemaker describe-training-job \
--training-job-name $name \
--query '{S3ModelArtifacts:ModelArtifacts.S3ModelArtifacts,TrainingImage:AlgorithmSpecification.TrainingImage,RoleArn:RoleArn}')
But for the life of me I can't seem to individually extract out S3ModelArtifacts, TrainingImage, and RoleArn.
It seems to be neither an array nor an associative array, yet it looks like JSON when I do echo $output.
Ultimately I just want to be able to do something like var=${output[TrainingImage]} but this just gives me the whole response instead of just the TrainingImage value.
Any help appreciated.
You can use the command-line tool jq to parse JSON output like so:
$ printf %s "$output" | jq '.TrainingImage'
"123456789877.dkr.ecr.eu-west-1.amazonaws.com/kmeans:1"
Or, as this is a pretty simple query, you can use sed:
$ printf %s "$output" | sed -n -e 's/^.*TrainingImage"://p'
"123456789877.dkr.ecr.eu-west-1.amazonaws.com/kmeans:1",
In the sed command, -n suppresses automatic printing, the substitution deletes everything on the line up to and including TrainingImage":, and the trailing p flag prints only the lines where the substitution succeeded.
Using Home Assistant 0.92 to test my CLI for creating automated backups. After a successful backup, the command responds with an output and I need to catch that value. I'm trying to use jq to parse it but only get an error.
$ hassio snapshots new --name="Testbackup"
This gives an output of slug: 07afd144 and I want to catch 07afd144
Tried following:
$ hassio snapshots new --name="Testbackup" | jq --raw-output '.data.slug'
This gives an output of parse error: Invalid numeric literal at line 1, column 5
The final result is planned to be:
slug=$(hassio snapshots new --name="${name}" | jq --raw-output '.data.slug')
where $slug would be 07afd144.
What am I doing wrong?
jq is a tool for parsing and transforming JSON documents. What you have shown is not legal JSON. It is, however, a legal YAML document and can be transformed with yq. yq uses jq-like syntax, but can handle JSON, YAML, XML, and CSV files.
slug=$(hassio snapshots new --name="${name}" | yq '.slug')
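A quick check of the filter against the sample output (assuming the Go implementation of yq, mikefarah/yq, which prints scalars raw by default):
$ echo 'slug: 07afd144' | yq '.slug'
07afd144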
slug: 07afd144 isn't valid JSON and as such cannot be parsed with jq. Furthermore, it doesn't contain a data property anywhere, so .data.slug doesn't make sense.
If the format is always this simple (property name, colon, space, value), then the value can easily be extracted with other common tools generally available on GNU+Linux systems (a quick demonstration follows the list):
cut (different invocations possible):
cut -d' ' -f2-
cut -c7-
awk:
awk '{print $2}'
sed:
sed 's/^slug: //'
perl:
perl -lane 'print $F[1]'
or even grep (different invocations possible):
grep -o '[^ ]*$'
grep -o '[[:xdigit:]]*$'
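For example, feeding the sample line through the awk variant:
$ echo 'slug: 07afd144' | awk '{print $2}'
07afd144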
I am struggling with the proper jq syntax for pulling out all the names from a curl call that looks like this:
my @repo = `curl -s -R -D $tmp_fh_header -u $o_user:$o_auth https://api.github.com/repos/mycompany/$repo/commits | jq '.[].login'`;
In my case, it should report 10 names, but it only comes back with 5 nulls.
Can you point me in the right direction?
Each commit object has no top-level login key, which is why .[].login yields nulls; the login fields live inside the nested author and committer objects. If you need the names for the author of the last commits, you can use:
curl -s https://api.github.com/repos/mycompany/$repo/commits | jq '.[].author.login'
If you need the names for the committer of the last commits, you can use:
curl -s https://api.github.com/repos/mycompany/$repo/commits | jq '.[].committer.login'
or if you need the unique names for that last call:
curl -s https://api.github.com/repos/mycompany/$repo/commits | jq '[.[].committer.login] | unique'
You can use pagination to control the number of results, as documented at https://developer.github.com/v3/#pagination
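For instance, the documented per_page query parameter caps the number of commits returned:
curl -s "https://api.github.com/repos/mycompany/$repo/commits?per_page=10" | jq '.[].author.login'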
I have the following variable, and I want to search it with the pattern "/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/":
export str='16/02/02 11:29:22 INFO mortbay.log: State being saved: {"#class":"com.paypal.fpti.hadoop.copy.FPTICopyState","timestamp":0,"state":"Running","name":"com.paypal.fpti.hadoop.copy.FPTICopyState","id":"99c7cba7-d211-4845-97a1-c34168a91b22","subStates":{"com.paypal.fpti.hadoop.copy.CopyToLocalJob_fpti-raw-data-4_2016/02/02/10/":{"#class":"com.paypal.fpti.hadoop.copy.CopyToJobState","timestamp":0,"state":"Stopped","name":"com.paypal.fpti.hadoop.copy.CopyToJobState","id":"99034acb-cfad-41a0-89ed-e2731b1f82ec","subStates":null,"instanceState":"PostDone","window":"2016-02-02T10:00:00.000Z","datasetname":"fpti-raw-data-4","sourceDir":"/fpti/v2/hdfs_writer_4//2016/02/02/10/","localDir":"/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_4//2016/02/02/10//"},"com.paypal.fpti.hadoop.copy.CopyToLocalJob_fpti-raw-data_2016/02/02/10/":{"#class":"com.paypal.fpti.hadoop.copy.CopyToJobState","timestamp":0,"state":"Stopped","name":"com.paypal.fpti.hadoop.copy.CopyToJobState","id":"40325dec-0fe2-4025-8258-f896f957ddf0","subStates":null,"instanceState":"PostDone","window":"2016-02-02T10:00:00.000Z","datasetname":"fpti-raw-data","sourceDir":"/fpti/v2/hdfs_writer//2016/02/02/10/","localDir":"/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp//2016/02/02/10//"},"com.paypal.fpti.hadoop.copy.CopyToLocalJob_fpti-raw-data-1_2016/02/02/10/":{"#class":"com.paypal.fpti.hadoop.copy.CopyToJobState","timestamp":0,"state":"Stopped","name":"com.paypal.fpti.hadoop.copy.CopyToJobState","id":"5216f8c1-2cfa-4eac-a390-f4d2bcd6584f","subStates":{},"instanceState":"PostDone","window":"2016-02-02T10:00:00.000Z","datasetname":"fpti-raw-data-1","sourceDir":"/fpti/v2/hdfs_writer_1//2016/02/02/10/","localDir":"/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_1//2016/02/02/10//"},"com.paypal.fpti.hadoop.copy.CopyToLocalJob_fpti-raw-data-2_2016/02/02/10/":{"#class":"com.paypal.fpti.hadoop.copy.CopyToJobState","timestamp":0,"state":"Stopped","name":"com.paypal.fpti.hadoop.copy.CopyToJobState","id":"5fcd0b6e-3df9-4f82-a76f-bc8ff1493623","subStates":{},"instanceState":"PostDone","window":"2016-02-02T10:00:00.000Z","datasetname":"fpti-raw-data-2","sourceDir":"/fpti/v2/hdfs_writer_2//2016/02/02/10/","localDir":"/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_2//2016/02/02/10//"},"com.paypal.fpti.hadoop.copy.CopyToLocalJob_fpti-raw-data-3_2016/02/02/10/":{"#class":"com.paypal.fpti.hadoop.copy.CopyToJobState","timestamp":0,"state":"Stopped","name":"com.paypal.fpti.hadoop.copy.CopyToJobState","id":"6ec9223a-fcf0-447a-b9ae-2020e3232f6d","subStates":{},"instanceState":"PostDone","window":"2016-02-02T10:00:00.000Z","datasetname":"fpti-raw-data-3","sourceDir":"/fpti/v2/hdfs_writer_3//2016/02/02/10/","localDir":"/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_3//2016/02/02/10//"},"com.paypal.fpti.hadoop.copy.CopyToLocalJob_fpti-raw-data-5_2016/02/02/10/":{"#class":"com.paypal.fpti.hadoop.copy.CopyToJobState","timestamp":0,"state":"Stopped","name":"com.paypal.fpti.hadoop.copy.CopyToJobState","id":"d123742c-8a55-4e25-bfa0-0a97f6ed25d7","subStates":{},"instanceState":"PostDone","window":"2016-02-02T10:00:00.000Z","datasetname":"fpti-raw-data-5","sourceDir":"/fpti/v2/hdfs_writer_5//2016/02/02/10/","localDir":"/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_5//2016/02/02/10//"}},"copystate":"CopyToLocalDone","start":"2016-02-02T11:21:24.678Z","end":null,"window":"2016-02-02T10:00:00.000Z","retryCount":0}'
I tried the following, but it gives only the first occurrence:
[ggangadharan#phxbastion2 ~]$ echo $str | awk '{match($0, "/x/home[/,a-z,0-9,_]+*", a)}END{print a[0]}'
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_4//2016/02/02/10//
But I want output like below:
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_4//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_1//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_2//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_3//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_5//2016/02/02/10//
Can somebody help me use awk for this scenario? Thanks in advance.
I'm not sure how to hack this in awk, but you can safely use egrep here:
$ echo "$str" | egrep -o '/x/home[/a-z0-9_]+'
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_4//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_1//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_2//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_3//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_5//2016/02/02/10//
Using "significant splitting" in AWK:
$ awk -v RS="\"" '/\/x\/home\/pp_dt_fpti_batch\/stampy_copy_orchestration\//' <<< "$str"
which gives
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_4//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_1//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_2//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_3//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_5//2016/02/02/10//
You specified /x/home/pp_dt_fpti_batch/stampy_copy_orchestration/ for your search pattern, so I used that. If you want something different, adjust the pattern accordingly.
This splits the input into records on each double quote (RS is set to ", quoted for the shell). Any record that matches the regular expression is printed. Input is supplied from the shell via the here-string <<< "$str". Maybe this is more readable:
$ awk -v RS='"' '/regexp/' <<< "$str"
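To see the record splitting in isolation, here is a tiny made-up input; each quote-delimited record is tested against the pattern on its own:
$ awk -v RS='"' '/needle/' <<< 'hay"needle"hay'
needle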
Here are two approaches using a JSON-aware command-line tool, here jq.
In both cases we assume that the string of interest is embedded in the
JSON object contained in $str
(1) In the following, we strip the log prefix up to the first { with sed so that jq receives valid JSON, pretty-print the object's values, and grep for
the string of interest in case it appears in a surprising spot. Further trimming of the result can easily be done (e.g. using sed) as desired:
$ sed 's/^[^{]*//' <<< "$str" | jq '.[]' | fgrep /x/home/pp_dt_fpti_batch/stampy_copy_orchestration/
"localDir": "/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_4//2016/02/02/10//"
"localDir": "/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp//2016/02/02/10//"
"localDir": "/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_1//2016/02/02/10//"
"localDir": "/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_2//2016/02/02/10//"
"localDir": "/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_3//2016/02/02/10//"
"localDir": "/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_5//2016/02/02/10//"
(2) The following query is appropriate if we are only interested in a
match if it occurs in an object as a value corresponding to the key "localDir":
sed 's/^[^{]*//' <<< "$str" |
jq -r '..
| select(.localDir?)
| .localDir
| select(test("/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/"))'
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_4//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_1//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_2//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_3//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_5//2016/02/02/10//
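Under the same assumptions, a more compact variant of (2), offered as a sketch that should behave identically:
sed 's/^[^{]*//' <<< "$str" |
jq -r '.. | .localDir? // empty | select(test("/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/"))'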