Parsing EC2 on-demand pricing from ec2instances.info - jq

I am trying to use curl and jq to parse the AWS EC2 on-demand pricing and construct a JSON map suitable for use in a Terraform module.
The script that I came up with looks like this, but it doesn't seem to be correct:
curl --silent --show-error 'https://raw.githubusercontent.com/powdahound/ec2instances.info/master/www/instances.json' |
jq '.[]
| .instance_type as $instance_type
| (.pricing | keys) as $keys
| [.pricing[].linux.ondemand | .] as $values
| reduce range(0; $keys|length) as $i
({}; . + { ($keys[$i] + "|" + $instance_type): $values[$i] })'
What am I doing wrong? Here is a smaller code sample to illustrate the problem:
curl --silent --show-error 'https://gist.githubusercontent.com/joshuaspence/0904a6ce25f8830d9ae2eac8fc44fc7a/raw/b24600ab2e536556a74f4dbb45e2ddaa432d430e/sample.json' |
jq '.[]
| .instance_type as $instance_type
| (.pricing | keys) as $keys
| [.pricing[].linux.ondemand | .] as $values
| reduce range(0; $keys|length) as $i
({}; . + { ($keys[$i] + "|" + $instance_type): $values[$i] })'
The expected output from the above command is:
{
"ap-south-1|m1.small": "N/A",
"us-east-1|m1.small": "0.061",
"sa-east-1|m1.small": "0.058",
"ap-northeast-2|m1.small": "0.058",
"ap-southeast-2|m1.small": "0.058",
"us-west-2|m1.small": "0.044",
"us-gov-west-1|m1.small": "0.053",
"us-west-1|m1.small": "0.047",
"eu-central-1|m1.small": "N/A",
"eu-west-1|m1.small": "0.047"
}
{
"ap-south-1|m1.medium": "N/A",
"us-east-1|m1.medium": "0.087",
"ap-northeast-1|m1.medium": "0.122",
"sa-east-1|m1.medium": "0.117",
"ap-northeast-2|m1.medium": "N/A",
"ap-southeast-1|m1.medium": "0.117",
"ap-southeast-2|m1.medium": "0.117",
"us-west-2|m1.medium": "0.087",
"us-gov-west-1|m1.medium": "0.106",
"us-west-1|m1.medium": "0.095",
"us-central-1|m1.medium": "N/A",
"us-west-1|m1.medium": "0.095"
}
The actual output is:
{
"ap-northeast-2|m1.small": "N/A",
"ap-south-1|m1.small": "0.061",
"ap-southeast-2|m1.small": "0.058",
"eu-central-1|m1.small": "0.058",
"eu-west-1|m1.small": "0.058",
"sa-east-1|m1.small": "0.044",
"us-east-1|m1.small": "0.053",
"us-gov-west-1|m1.small": "0.047",
"us-west-1|m1.small": "N/A",
"us-west-2|m1.small": "0.047"
}
{
"ap-northeast-1|m1.medium": "N/A",
"ap-northeast-2|m1.medium": "0.087",
"ap-south-1|m1.medium": "0.122",
"ap-southeast-1|m1.medium": "0.117",
"ap-southeast-2|m1.medium": "N/A",
"eu-central-1|m1.medium": "0.117",
"eu-west-1|m1.medium": "0.117",
"sa-east-1|m1.medium": "0.087",
"us-east-1|m1.medium": "0.106",
"us-gov-west-1|m1.medium": "0.095",
"us-west-1|m1.medium": "N/A",
"us-west-2|m1.medium": "0.095"
}

The reason your script produces the wrong output is that the two arrays you build are not in the same order: keys returns the key names sorted alphabetically, while .pricing[] emits the values in the order they appear in the object. Unless the keys of .pricing happen to be in sorted order already, (.pricing | keys) and [.pricing[].linux.ondemand | .] are misaligned, so the reduce pairs each region with the wrong price.
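A minimal illustration of the mismatch (the object here is just a stand-in for .pricing):
echo '{"b": 2, "a": 1}' | jq -c '[keys, [.[]]]'
[["a","b"],[2,1]]
keys[0] is "a", but the first value emitted by .[] is 2, which belongs to "b".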
A simplified and working version of your jq program goes as follows:
jq '.[] | .instance_type as $it | .pricing | with_entries(.key |= "\(.)|\($it)" | .value |= .linux.ondemand)'
This jq program uses with_entries, which turns the object into an array of {key, value} objects, applies the given filter to each entry, and reassembles the object. Because each key stays paired with its value throughout, the ordering problem never arises.
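For reference, with_entries(f) is defined in terms of to_entries and from_entries:
def with_entries(f): to_entries | map(f) | from_entries;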

Here is a solution which uses keys_unsorted to preserve the ordering of the keys in the original .pricing object.
.[]
| .instance_type as $instance_type
| .pricing
| [
keys_unsorted[] as $k
| .[$k].linux.ondemand
| {("\($k)|\($instance_type)"): .}
]
| add
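Both answers emit one object per instance type. If you ultimately want a single merged map for the Terraform module, you can collect the per-type objects into an array and merge them with add; a sketch based on the with_entries variant:
jq '[ .[]
      | .instance_type as $instance_type
      | .pricing
      | with_entries(.key |= "\(.)|\($instance_type)" | .value |= .linux.ondemand)
    ] | add'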

Related

combine multiple jq commands into one?

I use the following command to extract data from a JSON file. I only want to keep the first match, if there is one (if there is no match, an error should be printed).
.[] | select(.[1] | objects."/Type".N == "/Catalog") | .[1]."/Dests"
Let's say the output is 16876. Then, I use the following jq code to extract the data.
.[] | select(.[0] == 16876) | .[1] | to_entries[] | [.key, .value[0], .value[2].F, .value[3].F] | @tsv
This involves multiple passes of the input json data. Can the two jq commands be combined into one, so that one pass of the input json data is sufficient?
Use as $dests to bind the result of your first query to the name $dests, and then refer back to it in the second query, all within a single jq invocation:
(.[] | select(.[1] | objects."/Type".N == "/Catalog") | .[1]."/Dests") as $dests
| .[] | select(.[0] == $dests) | .[1] | to_entries[] | [.key, .value[0], .value[2].F, .value[3].F] | @tsv
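If the first query could match more than once and you only want the first result, and you want an error printed when nothing matches (as the question asks), you could wrap the binding in first(...) with an // error(...) fallback. A sketch (note that // would also trigger if the matched value itself were null or false):
(first(.[] | select(.[1] | objects."/Type".N == "/Catalog") | .[1]."/Dests")
  // error("no /Catalog entry found")) as $dests
| .[] | select(.[0] == $dests) | .[1] | to_entries[]
| [.key, .value[0], .value[2].F, .value[3].F] | @tsv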

jq: filter out IP addresses by regular expression

[
{
"arguments": {
"leases": [
{
"cltt": 1658763299,
"fqdn-fwd": false,
"fqdn-rev": false,
"hostname": "",
"hw-address": "00:aa:bb:cc:dd:ee",
"ip-address": "192.168.0.2",
"state": 0,
"subnet-id": 1,
"valid-lft": 3600
},
{
"cltt": 1658763207,
"fqdn-fwd": false,
"fqdn-rev": false,
"hostname": "",
"hw-address": "00:11:22:33:44:55",
"ip-address": "192.168.1.3",
"state": 0,
"subnet-id": 1,
"valid-lft": 3600
}
]
},
"result": 0,
"text": "2 IPv4 lease(s) found."
}
]
This is a snippet, but in reality there's much more entries. Currently I filter out MAC and IP with jq expression:
jq --raw-output '.[0] | select(.result == 0) | .arguments.leases[] | "\(.["hw-address"]) \(.["ip-address"])"'
Now I'm wondering: does jq have the ability to filter by regexp? For instance, I'd like to dump only entries where the IP is 192.168.1.*; can that be done with jq? Ideally I'd like to pass the regexp to my script as a parameter:
jq --raw-output --arg addr "$1" ...
Would appreciate suggestions on how to do this.
jq has test, which matches its input against a regular expression:
first
| select(.result == 0)
| .arguments.leases[]
| select(."ip-address"|test("^192\\.168\\.1"))
| "\(."hw-address") \(."ip-address")"
and to provide the regex as an argument on the command line:
jq -r --arg regex '^192\.168\.1\.' 'first
| select(.result == 0)
| .arguments.leases[]
| select(."ip-address"|test($regex))
| "\(."hw-address") \(."ip-address")"'
If you only want to check the start of the IP address, you could also use startswith: select(."ip-address"|startswith("192.168.1.")):
jq -r --arg prefix '192.168.1.' 'first
| select(.result == 0)
| .arguments.leases[]
| select(."ip-address"|startswith($prefix))
| "\(."hw-address") \(."ip-address")"'
You can use test with regular expressions, and select to filter:
jq -r --arg addr "192\\.168\\.1\\..*" '
.[0] | select(.result == 0) | .arguments.leases[]
| "\(.["hw-address"]) \(.["ip-address"] | select(test($addr)))"
'
00:11:22:33:44:55 192.168.1.3
Note: 192.168.1.* is not a regular expression (or at least not the one you want, as it would also match 192.168.100.4, for instance, because . stands for any character; a literal dot has to be escaped).

How to use jq to format array of objects to separated list of key values

How can I (generically) transform the input file below into the output file below, using jq? The record format of the output file is: array_index | key | value
Input file:
[{"a": 1, "b": 10},
{"a": 2, "d": "fred", "e": 30}]
Output File:
0|a|1
0|b|10
1|a|2
1|d|fred
1|e|30
Here's a solution using tostream, which turns the input into a stream of [path, value] events. select(has(1)) keeps only the events that actually carry a value (dropping the closing events), flatten merges path and value into one array, and join produces the output format:
jq -r 'tostream | select(has(1)) | flatten | join("|")'
0|a|1
0|b|10
1|a|2
1|d|fred
1|e|30
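For reference, the raw stream for the sample input looks roughly like this; the one-element arrays are the closing events that select(has(1)) drops:
echo '[{"a":1,"b":10},{"a":2,"d":"fred","e":30}]' | jq -c 'tostream'
[[0,"a"],1]
[[0,"b"],10]
[[0,"b"]]
[[1,"a"],2]
[[1,"d"],"fred"]
[[1,"e"],30]
[[1,"e"]]
[[1]]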
Or a very similar one using paths to get the paths, scalars for the filter, and getpath for the corresponding value:
jq -r 'paths(scalars) as $p | [$p[], getpath($p)] | join("|")'
0|a|1
0|b|10
1|a|2
1|d|fred
1|e|30
< file.json jq -r 'to_entries
| .[]
| .key as $k
| ((.value | to_entries)[]
| [$k, .key, .value])
| @csv'
Output:
0,"a",1
0,"b",10
1,"a",2
1,"d","fred"
1,"e",30
You just need to remove the double quotes.
to_entries can be used to loop over the elements of arrays and objects in a way that gives both the key (index) and the value of the element.
jq -r '
to_entries[] |
.key as $id |
.value |
to_entries[] |
[ $id, .key, .value ] |
join("|")
'
Replace join("|") with #csv to get proper CSV.

Azure Application Insights query: merge rows

I have following query:
traces
| where customDimensions.Category == "Function"
| where isnotempty(customDimensions.prop__recordId) or isnotempty(customDimensions.prop__Entity)
| project operation_Id, Entity = customDimensions.prop__Entity, recordName = customDimensions.prop__recordName, recordId = customDimensions.prop__recordId
I get results like these (screenshot omitted). I want to merge the rows by operation_Id and get results like these (screenshot omitted):
Please try using the join operator, like below:
traces
| where customDimensions.Category == "Function"
| where isnotempty(customDimensions.prop__recordId)
| project operation_Id, customDimensions.prop__recordId
| join kind = inner(
traces
| where customDimensions.Category == "Function"
| where isnotempty(customDimensions.prop__Entity)
| project operation_Id,customDimensions.prop__Entity,customDimensions.prop__recordName
) on operation_Id
| project-away operation_Id1 // remove the redundant column; note that it's operation_Id1
| project operation_Id, Entity = customDimensions_prop__Entity, recordName = customDimensions_prop__recordName, recordId = customDimensions_prop__recordId
I did not have the same data, but I made some similar data and it works fine on my side.
Before and after the merge (screenshots omitted), note that project-away is used to remove the redundant column created for the join key; by default that column gets the suffix 1.
The final query is:
traces
| where customDimensions.Category == "Function"
| where isnotempty(customDimensions.prop__recordId)
| project operation_Id, customDimensions.prop__recordId
| join kind = inner(
traces
| where customDimensions.Category == "Function"
| where isnotempty(customDimensions.prop__Entity)
| project operation_Id,customDimensions.prop__Entity
) on operation_Id
| join kind = inner(
traces
| where customDimensions.Category == "Function"
| where isnotempty(customDimensions.prop__recordName)
| project operation_Id,customDimensions.prop__recordName
) on operation_Id
| project operation_Id, Entity = customDimensions_prop__Entity, recordName = customDimensions_prop__recordName, recordId = customDimensions_prop__recordId

jq query with condition and format output/labels

I have a JSON file:
[
{
"platform": "p1",
"id": "5",
"pri": "0",
"sec": "20"
}
]
[
{
"platform": "p2",
"id": "6",
"pri": "10",
"sec": "0"
}
]
I can format it to this form:
$ jq -c '.[]|{PLATFORM: .platform, ID: .id, PRI: .pri, SEC: .sec}' test.json
{"PLATFORM":"p1","ID":"5","PRI":"0","SEC":"20"}
{"PLATFORM":"p2","ID":"6","PRI":"10","SEC":"0"}
$
but how do I ignore SEC/PRI entries with value "0" and get output in this form:
PLATFORM:p1, ID:5, SEC:20
PLATFORM:p2, ID:6, PRI:10
I can process it with bash/awk commands, but maybe someone has a solution with jq directly. Thank you.
You can use conditional statements to remove the unwanted keys, e.g.:
if (.sec == "0") then del(.sec) else . end
The formatting could be done with @tsv by converting each object's entries to an array of strings, e.g.:
filter.jq
.[] |
if (.sec == "0") then del(.sec) else . end |
if (.pri == "0") then del(.pri) else . end |
to_entries |
map("\(.key | ascii_upcase):\(.value)") |
@tsv
Run it like this:
jq -crf filter.jq test.json
Output:
PLATFORM:p1 ID:5 SEC:20
PLATFORM:p2 ID:6 PRI:10
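If you want the comma-separated form shown in the question rather than tabs, replacing @tsv with join(", ") does it; a sketch built on the same filter:
jq -r '.[]
  | if .sec == "0" then del(.sec) else . end
  | if .pri == "0" then del(.pri) else . end
  | to_entries
  | map("\(.key | ascii_upcase):\(.value)")
  | join(", ")' test.json
PLATFORM:p1, ID:5, SEC:20
PLATFORM:p2, ID:6, PRI:10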
jq solution:
jq -c 'def del_empty($k): if (.[$k]|tonumber > 0) then . else del(.[$k]) end;
.[] | {PLATFORM: .platform, ID: .id, PRI: .pri, SEC: .sec}
| del_empty("PRI")
| del_empty("SEC")' test.json
The output:
{"PLATFORM":"p1","ID":"5","SEC":"20"}
{"PLATFORM":"p2","ID":"6","PRI":"10"}
