I have a number of JSON files that look like this:
{
  "property": "value1",
  ...
}
What I want is an output file that looks like this:
{
  "<filename1>": "<value1>",
  "<filename2>": "<value2>",
  "<filename3>": "<value3>",
  ...
}
This can be achieved with two jq invocations and a shell pipe:
jq '{(input_filename):.property}' * | jq -s add
However, I was wondering whether this is possible with a single jq invocation (or any other simpler way).
I'm currently using jq version 1.5-1 in case it's relevant.
Use inputs in combination with the -n option to sequentially access all input files.
In direct analogy to your approach, you could use [inputs] to create the array that the -s option would have produced, and then add up the items as before:
jq -n '[inputs | {(input_filename): .property}] | add' *
But in a more straightforward way, you could employ reduce to iteratively build up your result object:
jq -n 'reduce inputs as $in ({}; .[input_filename] = $in.property)' *
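For illustration, suppose two hypothetical files a.json and b.json (the names here are assumed), each holding one such object. The reduce invocation should then yield:
$ echo '{"property": "value1"}' > a.json
$ echo '{"property": "value2"}' > b.json
$ jq -n 'reduce inputs as $in ({}; .[input_filename] = $in.property)' a.json b.json
{
  "a.json": "value1",
  "b.json": "value2"
}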
I'm running this query with zsh:
output=$(aws sagemaker describe-training-job \
--training-job-name $name \
--query '{S3ModelArtifacts:ModelArtifacts.S3ModelArtifacts,TrainingImage:AlgorithmSpecification.TrainingImage,RoleArn:RoleArn}')
But for the life of me I can't seem to individually extract out S3ModelArtifacts, TrainingImage, and RoleArn.
It seems to be neither an array nor an associative array, yet it looks like JSON when I do echo $output.
Ultimately I just want to be able to do something like var=${output[TrainingImage]} but this just gives me the whole response instead of just the TrainingImage value.
Any help appreciated.
You can use the command-line tool jq to parse JSON output like so:
$ printf %s "$output" | jq '.TrainingImage'
"123456789877.dkr.ecr.eu-west-1.amazonaws.com/kmeans:1"
Or, as this is a pretty simple query, you can use sed:
$ printf %s "$output" | sed -n -e 's/^.*TrainingImage"://p'
"123456789877.dkr.ecr.eu-west-1.amazonaws.com/kmeans:1",
In the sed command, -n suppresses automatic printing, the substitution deletes everything on the line up to and including TrainingImage":, and the trailing p flag prints only the lines where the substitution succeeded.
Using Home Assistant 0.92 to test my CLI for creating automated backups. After a successful backup, the command responds with an output and I need to catch that value. I'm trying to use jq to parse it but only get an error.
$ hassio snapshots new --name="Testbackup"
This gives an output of slug: 07afd144 and I want to catch 07afd144
Tried following:
$ hassio snapshots new --name="Testbackup" | jq --raw-output '.data.slug'
This gives an output of parse error: Invalid numeric literal at line 1, column 5
The final result is planned to be:
slug=$(hassio snapshots new --name="${name}" | jq --raw-output '.data.slug')
where $slug would be 07afd144.
What am I doing wrong?
jq is a tool for parsing and transforming JSON documents. What you have shown is not legal JSON. It is, however, a legal YAML document and can be transformed with yq. yq uses jq-like syntax, but can handle JSON, YAML, XML, and CSV files.
slug=$(hassio snapshots new --name="${name}" | yq '.slug')
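A quick check of the filter against the sample output (assuming the Go implementation of yq, mikefarah/yq, which prints scalars raw by default):
$ echo 'slug: 07afd144' | yq '.slug'
07afd144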
slug: 07afd144 isn't valid JSON and as such cannot be parsed with jq. Furthermore, it doesn't contain a data property anywhere, so .data.slug doesn't make sense.
If the format is always this simple (property name, colon, space, value), then the value can easily be extracted with other common tools generally available on GNU+Linux systems (a quick demonstration follows the list):
cut (different invocations possible):
cut -d' ' -f2-
cut -c7-
awk:
awk '{print $2}'
sed:
sed 's/^slug: //'
perl:
perl -lane 'print $F[1]'
or even grep (different invocations possible):
grep -o '[^ ]*$'
grep -o '[[:xdigit:]]*$'
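For example, feeding the sample line through the awk variant:
$ echo 'slug: 07afd144' | awk '{print $2}'
07afd144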
I am struggling with the proper jq syntax for pulling out all the names from a curl call that looks like this:
my @repo = `curl -s -R -D $tmp_fh_header -u $o_user:$o_auth https://api.github.com/repos/mycompany/$repo/commits | jq '.[].login'`;
In my case, it should report 10 names, but it only comes back with 5 nulls.
Can you point me in the right direction?
Each commit object has no top-level login key, which is why .[].login yields nulls; the login fields live inside the nested author and committer objects. If you need the names for the author of the last commits, you can use:
curl -s https://api.github.com/repos/mycompany/$repo/commits | jq '.[].author.login'
If you need the names for the committer of the last commits, you can use:
curl -s https://api.github.com/repos/mycompany/$repo/commits | jq '.[].committer.login'
or if you need the unique names for that last call:
curl -s https://api.github.com/repos/mycompany/$repo/commits | jq '[.[].committer.login] | unique'
You can use pagination to control the number of results, as documented at https://developer.github.com/v3/#pagination
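For instance, the documented per_page query parameter caps the number of commits returned:
curl -s "https://api.github.com/repos/mycompany/$repo/commits?per_page=10" | jq '.[].author.login'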
I have the following variable, and I want to search it with the pattern "/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/":
export str='16/02/02 11:29:22 INFO mortbay.log: State being saved: {"#class":"com.paypal.fpti.hadoop.copy.FPTICopyState","timestamp":0,"state":"Running","name":"com.paypal.fpti.hadoop.copy.FPTICopyState","id":"99c7cba7-d211-4845-97a1-c34168a91b22","subStates":{"com.paypal.fpti.hadoop.copy.CopyToLocalJob_fpti-raw-data-4_2016/02/02/10/":{"#class":"com.paypal.fpti.hadoop.copy.CopyToJobState","timestamp":0,"state":"Stopped","name":"com.paypal.fpti.hadoop.copy.CopyToJobState","id":"99034acb-cfad-41a0-89ed-e2731b1f82ec","subStates":null,"instanceState":"PostDone","window":"2016-02-02T10:00:00.000Z","datasetname":"fpti-raw-data-4","sourceDir":"/fpti/v2/hdfs_writer_4//2016/02/02/10/","localDir":"/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_4//2016/02/02/10//"},"com.paypal.fpti.hadoop.copy.CopyToLocalJob_fpti-raw-data_2016/02/02/10/":{"#class":"com.paypal.fpti.hadoop.copy.CopyToJobState","timestamp":0,"state":"Stopped","name":"com.paypal.fpti.hadoop.copy.CopyToJobState","id":"40325dec-0fe2-4025-8258-f896f957ddf0","subStates":null,"instanceState":"PostDone","window":"2016-02-02T10:00:00.000Z","datasetname":"fpti-raw-data","sourceDir":"/fpti/v2/hdfs_writer//2016/02/02/10/","localDir":"/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp//2016/02/02/10//"},"com.paypal.fpti.hadoop.copy.CopyToLocalJob_fpti-raw-data-1_2016/02/02/10/":{"#class":"com.paypal.fpti.hadoop.copy.CopyToJobState","timestamp":0,"state":"Stopped","name":"com.paypal.fpti.hadoop.copy.CopyToJobState","id":"5216f8c1-2cfa-4eac-a390-f4d2bcd6584f","subStates":{},"instanceState":"PostDone","window":"2016-02-02T10:00:00.000Z","datasetname":"fpti-raw-data-1","sourceDir":"/fpti/v2/hdfs_writer_1//2016/02/02/10/","localDir":"/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_1//2016/02/02/10//"},"com.paypal.fpti.hadoop.copy.CopyToLocalJob_fpti-raw-data-2_2016/02/02/10/":{"#class":"com.paypal.fpti.hadoop.copy.CopyToJobState","timestamp":0,"state":"Stopped","name":"com.paypal.fpti.hadoop.copy.CopyToJobState","id":"5fcd0b6e-3df9-4f82-a76f-bc8ff1493623","subStates":{},"instanceState":"PostDone","window":"2016-02-02T10:00:00.000Z","datasetname":"fpti-raw-data-2","sourceDir":"/fpti/v2/hdfs_writer_2//2016/02/02/10/","localDir":"/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_2//2016/02/02/10//"},"com.paypal.fpti.hadoop.copy.CopyToLocalJob_fpti-raw-data-3_2016/02/02/10/":{"#class":"com.paypal.fpti.hadoop.copy.CopyToJobState","timestamp":0,"state":"Stopped","name":"com.paypal.fpti.hadoop.copy.CopyToJobState","id":"6ec9223a-fcf0-447a-b9ae-2020e3232f6d","subStates":{},"instanceState":"PostDone","window":"2016-02-02T10:00:00.000Z","datasetname":"fpti-raw-data-3","sourceDir":"/fpti/v2/hdfs_writer_3//2016/02/02/10/","localDir":"/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_3//2016/02/02/10//"},"com.paypal.fpti.hadoop.copy.CopyToLocalJob_fpti-raw-data-5_2016/02/02/10/":{"#class":"com.paypal.fpti.hadoop.copy.CopyToJobState","timestamp":0,"state":"Stopped","name":"com.paypal.fpti.hadoop.copy.CopyToJobState","id":"d123742c-8a55-4e25-bfa0-0a97f6ed25d7","subStates":{},"instanceState":"PostDone","window":"2016-02-02T10:00:00.000Z","datasetname":"fpti-raw-data-5","sourceDir":"/fpti/v2/hdfs_writer_5//2016/02/02/10/","localDir":"/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_5//2016/02/02/10//"}},"copystate":"CopyToLocalDone","start":"2016-02-02T11:21:24.678Z","end":null,"window":"2016-02-02T10:00:00.000Z","retryCount":0}'
I tried the following, but it gives only the first occurrence:
[ggangadharan#phxbastion2 ~]$ echo $str | awk '{match($0, "/x/home[/,a-z,0-9,_]+*", a)}END{print a[0]}'
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_4//2016/02/02/10//
But I want output like below:
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_4//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_1//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_2//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_3//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_5//2016/02/02/10//
Can somebody help me use awk for this scenario? Thanks in advance.
I'm not sure how to hack this in awk, but you can safely use egrep here:
$ echo "$str" | egrep -o '/x/home[/a-z0-9_]+'
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_4//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_1//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_2//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_3//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_5//2016/02/02/10//
Using "significant splitting" in AWK:
$ awk -v RS="\"" '/\/x\/home\/pp_dt_fpti_batch\/stampy_copy_orchestration\//' <<< "$str"
which gives
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_4//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_1//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_2//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_3//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_5//2016/02/02/10//
You specified /x/home/pp_dt_fpti_batch/stampy_copy_orchestration/ for your search pattern, so I used that. If you want something different, adjust the pattern accordingly.
This splits the input into records on each double quote (RS is set to ", quoted for the shell). Any record that matches the regular expression is printed. Input is supplied from the shell via the here-string <<< "$str". Maybe this is more readable:
$ awk -v RS='"' '/regexp/' <<< "$str"
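To see the record splitting in isolation, here is a tiny made-up input; each quote-delimited record is tested against the pattern on its own:
$ awk -v RS='"' '/needle/' <<< 'hay"needle"hay'
needle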
Here are two approaches using a JSON-aware command-line tool, here jq.
In both cases we assume that the string of interest is embedded in the
JSON object contained in $str
(1) In the following, we strip the log prefix up to the first { with sed so that jq receives valid JSON, pretty-print the object's values, and grep for
the string of interest in case it appears in a surprising spot. Further trimming of the result can easily be done (e.g. using sed) as desired:
$ sed 's/^[^{]*//' <<< "$str" | jq '.[]' | fgrep /x/home/pp_dt_fpti_batch/stampy_copy_orchestration/
"localDir": "/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_4//2016/02/02/10//"
"localDir": "/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp//2016/02/02/10//"
"localDir": "/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_1//2016/02/02/10//"
"localDir": "/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_2//2016/02/02/10//"
"localDir": "/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_3//2016/02/02/10//"
"localDir": "/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_5//2016/02/02/10//"
(2) The following query is appropriate if we are only interested in a
match if it occurs in an object as a value corresponding to the key "localDir":
sed 's/^[^{]*//' <<< "$str" |
jq -r '..
| select(.localDir?)
| .localDir
| select(test("/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/"))'
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_4//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_1//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_2//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_3//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_5//2016/02/02/10//
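Under the same assumptions, a more compact variant of (2), offered as a sketch that should behave identically:
sed 's/^[^{]*//' <<< "$str" |
jq -r '.. | .localDir? // empty | select(test("/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/"))'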