What is the JsonPath expression for selecting an object based on sub-object values? - jsonpath

I need to be able to select elements within a JSON document based on the values in sub-elements which, unfortunately, reside in a list of key-value pairs (this is the structure I have to work with). I'm using Jayway 2.4.0.
Here is the JSON document:
{
"topLevelArray": [
{
"elementId": "Elem1",
"keyValuePairs": [
{
"key": "Length",
"value": "10"
},
{
"key": "Width",
"value": "3"
},
{
"key": "Producer",
"value": "alpha"
}
]
},
{
"elementId": "Elem2",
"keyValuePairs": [
{
"key": "Length",
"value": "20"
},
{
"key": "Width",
"value": "8"
},
{
"key": "Producer",
"value": "beta"
}
]
},
{
"elementId": "Elem3",
"keyValuePairs": [
{
"key": "Length",
"value": "15"
},
{
"key": "Width",
"value": "5"
},
{
"key": "Producer",
"value": "beta"
}
]
}
]
}
Here is the JsonPath I thought would do the trick:
$..topLevelArray[ ?( @.keyValuePairs[ ?(@.key=='Producer' && @.value=='beta') ] ) ]
and
$.topLevelArray[ ?( @.keyValuePairs[ ?(@.key=='Producer' && @.value=='beta') ] ) ]
Unfortunately, both return every element in the list, including the entry whose Producer is 'alpha'. Thanks in advance.
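To pin down the intended result independently of any JsonPath engine, here is a minimal Python sketch of the selection described above. It is not a Jayway expression; the helper name select_by_pair is hypothetical, and the document is abbreviated to the Producer pairs only.

```python
import json

# Abbreviated copy of the document above (only the Producer pairs are kept).
doc = json.loads("""
{"topLevelArray": [
  {"elementId": "Elem1", "keyValuePairs": [{"key": "Producer", "value": "alpha"}]},
  {"elementId": "Elem2", "keyValuePairs": [{"key": "Producer", "value": "beta"}]},
  {"elementId": "Elem3", "keyValuePairs": [{"key": "Producer", "value": "beta"}]}
]}
""")


def select_by_pair(elements, key, value):
    """Keep elements that have at least one pair matching both key and value."""
    return [
        element
        for element in elements
        if any(
            pair.get("key") == key and pair.get("value") == value
            for pair in element.get("keyValuePairs", [])
        )
    ]


matches = select_by_pair(doc["topLevelArray"], "Producer", "beta")
print([element["elementId"] for element in matches])  # ['Elem2', 'Elem3']
```

The key point is that the key and the value are tested on the same pair object, and an element qualifies as soon as one pair matches.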

Related

How to project values from a Gremlin traversal with nested and()/or() steps

I have the graph model below which represents the sub-pattern I'd like to traverse or fetch. The nodes and their properties are shown below as well.
The expected response to my query would look something like this:
where 's', 'c', 'aid', 'qid', 'p', 'r1', 'r2' are the nodes that make up the subpattern or subgraph.
[
{
"s": {
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4",
"label": "severity",
"type": "vertex",
"properties": {
"severity": [
{
"id": "a6a9e38f-0802-48b6-ac37-490f45e824e9",
"value": "High"
}
],
"pk": [
{
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4|pk",
"value": "pk"
}
]
}
},
"c": {
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4",
"label": "cve",
"type": "vertex",
"properties": {
"cve_id": [
{
"id": "a6a9e38f-0802-48b6-ac37-490f45e824e9",
"value": "CVE-xxxx-xxxx"
}
],
"publishedOn": [
{
"id": "fc5dde4d-c027-4c19-9b16-b3314b2b10c6",
"value": "xxx"
}
],
"pk": [
{
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4|pk",
"value": "pk"
}
]
}
},
"aid": {
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4",
"label": "aid",
"type": "vertex",
"properties": {
"aid": [
{
"id": "a6a9e38f-0802-48b6-ac37-490f45e824e9",
"value": "xxxx-xxxx"
}
],
"pk": [
{
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4|pk",
"value": "pk"
}
]
}
},
"qid": {
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4",
"label": "qid",
"type": "vertex",
"properties": {
"qid": [
{
"id": "a6a9e38f-0802-48b6-ac37-490f45e824e9",
"value": "xxxx-xxxx"
}
],
"pk": [
{
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4|pk",
"value": "pk"
}
]
}
},
"p": {
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4",
"label": "package",
"type": "vertex",
"properties": {
"name": [
{
"id": "a6a9e38f-0802-48b6-ac37-490f45e824e9",
"value": "xxxxx"
}
],
"version": [
{
"id": "fc5dde4d-c027-4c19-9b16-b3314b2b10c6",
"value": "xxx"
}
],
"pk": [
{
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4|pk",
"value": "pk"
}
]
}
},
"r1": {
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4",
"label": "release",
"type": "vertex",
"properties": {
"source": [
{
"id": "a6a9e38f-0802-48b6-ac37-490f45e824e9",
"value": "xxxx-xxxx"
}
],
"status": [
{
"id": "fc5dde4d-c027-4c19-9b16-b3314b2b10c6",
"value": "xxx"
}
],
"pk": [
{
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4|pk",
"value": "pk"
}
]
}
},
"r2": {
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4",
"label": "release",
"type": "vertex",
"properties": {
"source": [
{
"id": "a6a9e38f-0802-48b6-ac37-490f45e824e9",
"value": "xxxx-xxxx"
}
],
"status": [
{
"id": "fc5dde4d-c027-4c19-9b16-b3314b2b10c6",
"value": "xxx"
}
],
"pk": [
{
"id": "345fbdad-9c67-47bb-9f3b-cf50c8cdbee4|pk",
"value": "pk"
}
]
}
}
},
{
....
....
},
{
....
..
}
]
My question is how do I build my traversal query to achieve this end result?
What I have so far is this, but the project() step is not working as expected:
g.V().hasLabel('cve').as('c').and(
__.in('severity').as('s'),
__.out('cve_to_aid').as('aid').and(
__.out('has_qid').as('qid'),
__.in('package_to_aid').as('p'),
or(
__.in('r1_to_aid').has('status', 'Patched').as('r1'),
__.in('r2_to_aid').has('status', 'Patched').as('r2')
)
)
).project('c', 's', 'aid', 'qid', 'p', 'r1', 'r2').
by(('c').values('cve_id')).
by(('s').values('severity')).
by(('aid').values('aid')).
by(('qid').values('qid')).
by(('p').values()).
by(('r1').values()).
by(('r2').values()).
I am doing this on CosmosDB, so please only provide answers using supported steps found here: https://learn.microsoft.com/en-us/azure/cosmos-db/gremlin/support
It is possible to nest project() steps, e.g. on the TinkerGraph:
gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V(1).as('x').project('x').by(
select('x').project('id', 'label','properties').by(id).by(label).by(
project('name').by(properties())
)
)
==>[x:[id:1,label:person,properties:[name:vp[name->marko]]]]
gremlin>
but then you end up coding your entire data model into your query.
In full TinkerPop you could turn your result into a subgraph with the subgraph() step and write it to GraphSON with the io() step. In Cosmos DB you can add the returned vertices to a TinkerGraph instance on the client side and again use the io() step to serialize the TinkerGraph to GraphSON.

Merge all objects inside an array that share the same key

I'm trying to deduplicate all objects inside the array results that share the same key id, and merge their path arrays.
JSON input:
[
{
"type": "apple",
"results": [
{
"id": "apple1",
"name": "appleName1",
"path": "/some/path/a"
},
{
"id": "apple1",
"name": "appleName1",
"path": "/some/path/b"
},
{
"id": "apple2",
"name": "appleName2",
"path": "/some/path/c"
}
]
},
{
"type": "orange",
"results": [
{
"id": "orange1",
"name": "orangeName1",
"path": "/some/path/a"
},
{
"id": "orange1",
"name": "orangeName1",
"path": "/some/path/b"
},
{
"id": "orange2",
"name": "orangeName2",
"path": "/some/path/c"
}
]
}
]
Expected output:
[
{
"type": "apple",
"results": [
{
"id": "apple1",
"name": "appleName1",
"path": [
"/some/path/a",
"/some/path/b"
]
},
{
"id": "apple2",
"name": "appleName2",
"path": [
"/some/path/c"
]
}
]
},
{
"type": "orange",
"results": [
{
"id": "orange1",
"name": "orangeName1",
"path": [
"/some/path/a",
"/some/path/b"
]
},
{
"id": "orange2",
"name": "orangeName2",
"path": [
"/some/path/c"
]
}
]
}
]
I've managed to get an approximate solution using:
jq '[{type: .[].type, results: .[].results | group_by(.id) | map({id: .[0].id, name: .[0].name, path: (map(.path))})}]'
But my solution produces two additional elements that aren't supposed to be there.
I know there are some similar questions already answered but I didn't manage to get them to work with this example. Any help is appreciated!
You could group_by the .id field, then for each group take the first item and replace its .path field with a map on the .path fields of all group members:
jq 'map(.results |= (group_by(.id) | map(first + {path: map(.path)})))'
[
{
"type": "apple",
"results": [
{
"id": "apple1",
"name": "appleName1",
"path": [
"/some/path/a",
"/some/path/b"
]
},
{
"id": "apple2",
"name": "appleName2",
"path": [
"/some/path/c"
]
}
]
},
{
"type": "orange",
"results": [
{
"id": "orange1",
"name": "orangeName1",
"path": [
"/some/path/a",
"/some/path/b"
]
},
{
"id": "orange2",
"name": "orangeName2",
"path": [
"/some/path/c"
]
}
]
}
]
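For readers who want the same merge outside jq, here is a rough Python sketch of the group-and-merge idea. The function name merge_results is hypothetical, and the input is a shortened copy of the question's data. One difference to be aware of: jq's group_by sorts the groups by key, whereas this sketch keeps ids in first-seen order.

```python
def merge_results(groups):
    """Merge results that share an id, collecting their paths into a list."""
    merged_groups = []
    for group in groups:
        by_id = {}
        for item in group["results"]:
            # The first item seen for an id supplies the other fields.
            entry = by_id.setdefault(item["id"], {**item, "path": []})
            entry["path"].append(item["path"])
        merged_groups.append({**group, "results": list(by_id.values())})
    return merged_groups


# Shortened copy of the input from the question.
data = [
    {
        "type": "apple",
        "results": [
            {"id": "apple1", "name": "appleName1", "path": "/some/path/a"},
            {"id": "apple1", "name": "appleName1", "path": "/some/path/b"},
            {"id": "apple2", "name": "appleName2", "path": "/some/path/c"},
        ],
    }
]

merged = merge_results(data)
print(merged[0]["results"][0]["path"])  # ['/some/path/a', '/some/path/b']
```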

Given the same sub key of a nested dictionary, subtract corresponding keys

I have a dictionary dct with two sets set1 and set2 of different lengths.
dct ={
"id": "1234",
"set1": [
{
"sub_id": "1234a",
"details": [
{
"sum": "10",
"label": "pattern1"
}
],
},
{
"sub_id": "1234b",
"details": [
{
"sum": "10",
"label": "pattern3"
}
],
}
],
"set2": [
{
"sub_id": "3463a",
"details": [
{
"sum": "20",
"label": "pattern1"
}
],
},
{
"sub_id": "3463b",
"details": [
{
"sum": "100",
"label": "pattern2"
}
],
},
{
"sub_id": "3463c",
"details": [
{
"sum": "100",
"label": "pattern3"
}
],
}
]
}
I need to check for each label if the corresponding sum has changed, and if yes, subtract these.
pairs1=[]
pairs2=[]
for d in dct['set1']:
    for dd in d['details']:
        pairs1.append((dd['label'],dd['sum']))
for d in dct['set2']:
    for dd in d['details']:
        pairs2.append((dd['label'],dd['sum']))
result={}
for p in pairs1:
    for pp in pairs2:
        if p[0] == pp[0]:
            result[p[0]]= int(pp[1])-int(p[1])
result
Output something like:
{'pattern1': 10, 'pattern3': 90}
Is there a better way to iterate through the nested dictionary?
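One possible tidier approach, sketched under two assumptions that the question does not state: each label occurs at most once per set, and labels whose sum is unchanged should be left out of the result. The helper names sums_by_label and label_differences are hypothetical, and dct is a compact copy of the data above.

```python
def sums_by_label(entries):
    """Flatten one set into a {label: sum} mapping with integer sums."""
    return {
        detail["label"]: int(detail["sum"])
        for entry in entries
        for detail in entry["details"]
    }


def label_differences(dct):
    """Return set2 sum minus set1 sum for labels present in both sets."""
    before = sums_by_label(dct["set1"])
    after = sums_by_label(dct["set2"])
    return {
        label: after[label] - before[label]
        for label in before
        # Assumption: unchanged sums are skipped rather than reported as 0.
        if label in after and after[label] != before[label]
    }


# Compact copy of the dictionary from the question.
dct = {
    "id": "1234",
    "set1": [
        {"sub_id": "1234a", "details": [{"sum": "10", "label": "pattern1"}]},
        {"sub_id": "1234b", "details": [{"sum": "10", "label": "pattern3"}]},
    ],
    "set2": [
        {"sub_id": "3463a", "details": [{"sum": "20", "label": "pattern1"}]},
        {"sub_id": "3463b", "details": [{"sum": "100", "label": "pattern2"}]},
        {"sub_id": "3463c", "details": [{"sum": "100", "label": "pattern3"}]},
    ],
}

print(label_differences(dct))  # {'pattern1': 10, 'pattern3': 90}
```

Building a mapping per set replaces the nested loop over all pairs with a direct lookup by label.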

Why does Google Analytics return a different number of results with the same parameters?

Reporting API v4
I am a developer managing my clients' Google AdWords and Analytics accounts, and I have been using the AdWords and Analytics reporting APIs for almost a year now.
I also use the Query Explorer at https://ga-dev-tools.appspot.com/query-explorer/ to check whether I have retrieved the right amount of data.
I don't know whether this is an error, but it is behaving strangely.
Attempt 1, using https://ga-dev-tools.appspot.com/query-explorer/
I added 2 metrics and 7 dimensions. This account contains about 1 million results in a single month. I know this because I retrieved 1 million results for the range July 25, 2018 to August 16, 2018.
Here is the interesting part: when I run the query again with the same parameters, it returns 5999 results. When I run it once more, it returns 1 million. The results keep changing. I thought the error was in my code, but it also happens in the Query Explorer.
What do you think? Is it a bug or not?
You can try this yourself if you have more than a million results.
I know this is not strictly a coding question, but Google Analytics doesn't have a forum the way AdWords does.
Attempt 2, using https://developers.google.com/analytics/devguides/reporting/core/v4/rest/v4/reports/batchGet
This is my request:
{
"reportRequests": [
{
"dateRanges": [
{
"endDate": "2018-08-16",
"startDate": "2018-07-16"
}
],
"dimensions": [
{
"name": "ga:dimension2"
},
{
"name": "ga:dimension3"
},
{
"name": "ga:dimension1"
},
{
"name": "ga:adPlacementDomain"
}
],
"pageSize": 5,
"viewId": "********",
"samplingLevel": "LARGE",
"metrics": [
{
"expression": "ga:entrances"
},
{
"expression": "ga:newUsers"
}
],
"includeEmptyRows": true
}
]
}
The returned rowCount is sometimes 2111 and sometimes 1000000.
This is my response JSON with 1 million results:
{
"reports": [
{
"columnHeader": {
"dimensions": [
"ga:dimension2",
"ga:dimension3",
"ga:dimension1",
"ga:adPlacementDomain"
],
"metricHeader": {
"metricHeaderEntries": [
{
"name": "ga:entrances",
"type": "INTEGER"
},
{
"name": "ga:newUsers",
"type": "INTEGER"
}
]
}
},
"data": {
"rows": [
{
"dimensions": [
"(other)",
"(other)",
"(other)",
"(other)"
],
"metrics": [
{
"values": [
"120834",
"68730"
]
}
]
},
{
"dimensions": [
"1000025873.1532426892",
"1532426891790.o9z84x",
"2018-07-24T11:08:15.449+01:00",
"unknown"
],
"metrics": [
{
"values": [
"0",
"0"
]
}
]
},
{
"dimensions": [
"1000025873.1532426892",
"1532426891790.o9z84x",
"2018-07-24T11:08:17.589+01:00",
"unknown"
],
"metrics": [
{
"values": [
"0",
"0"
]
}
]
},
{
"dimensions": [
"1000025873.1532426892",
"1532426891790.o9z84x",
"2018-07-24T11:08:31.809+01:00",
"unknown"
],
"metrics": [
{
"values": [
"0",
"0"
]
}
]
},
{
"dimensions": [
"1000025873.1532426892",
"1532427045552.p38pk78",
"2018-07-24T11:09:06.43+01:00",
"unknown"
],
"metrics": [
{
"values": [
"0",
"0"
]
}
]
}
],
"totals": [
{
"values": [
"158626",
"90225"
]
}
],
"rowCount": 1000000,
"minimums": [
{
"values": [
"0",
"0"
]
}
],
"maximums": [
{
"values": [
"120834",
"68730"
]
}
],
"isDataGolden": true
},
"nextPageToken": "5"
}
]
}
Another example response, when I get fewer than 1 million results:
{
"reports": [
{
"columnHeader": {
"dimensions": [
"ga:dimension2",
"ga:dimension3",
"ga:dimension1",
"ga:adPlacementDomain"
],
"metricHeader": {
"metricHeaderEntries": [
{
"name": "ga:entrances",
"type": "INTEGER"
},
{
"name": "ga:newUsers",
"type": "INTEGER"
}
]
}
},
"data": {
"rows": [
{
"dimensions": [
"1002211166.1531434756",
"1531762918308.fjnj7pa6",
"2018-07-16T18:41:58.307+01:00",
"mobileapp::2-com.forsbit.spider"
],
"metrics": [
{
"values": [
"1",
"0"
]
}
]
},
{
"dimensions": [
"1002211166.1531434756",
"1531771001486.jawfrpz8",
"2018-07-16T20:56:41.482+01:00",
"mobileapp::2-com.forsbit.spider"
],
"metrics": [
{
"values": [
"1",
"0"
]
}
]
},
{
"dimensions": [
"1002211166.1531434756",
"1531772475507.7n4w2qzb",
"2018-07-16T21:21:15.503+01:00",
"mobileapp::2-com.forsbit.spider"
],
"metrics": [
{
"values": [
"1",
"0"
]
}
]
},
{
"dimensions": [
"1002211166.1531434756",
"1531859165986.zl7we6a5",
"2018-07-17T21:26:05.977+01:00",
"mobileapp::2-com.forsbit.spider"
],
"metrics": [
{
"values": [
"1",
"0"
]
}
]
},
{
"dimensions": [
"1002211166.1531434756",
"1531859632678.dz7hccsa",
"2018-07-17T21:33:52.673+01:00",
"mobileapp::2-com.forsbit.spider"
],
"metrics": [
{
"values": [
"1",
"0"
]
}
]
},
{
"dimensions": [
"1002211166.1531434756",
"1531861026792.kw71ngx9",
"2018-07-17T21:42:31.667+01:00",
"mobileapp::2-com.forsbit.spider"
],
"metrics": [
{
"values": [
"1",
"0"
]
}
]
}
],
"totals": [
{
"values": [
"2111",
"233"
]
}
],
"rowCount": 2112,
"minimums": [
{
"values": [
"0",
"0"
]
}
],
"maximums": [
{
"values": [
"1",
"1"
]
}
],
"isDataGolden": true
},
"nextPageToken": "6"
}
]
}
I am assuming that the query parameters were identical between runs. Double-check just to make sure.
The second step would be to check for sampling. Look for the fields samplingSpaceSizes and samplesReadCounts in the response. If these fields are not defined, no sampling was applied.
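The sampling check can be sketched in Python against an already-parsed batchGet response. This assumes the two fields sit under each report's data object, as in the Reporting API v4 response format. The helper name sampling_info is hypothetical, and the numbers in the second example are made up for illustration.

```python
def sampling_info(response):
    """Report, per report in the response, whether sampling fields are present."""
    info = []
    for report in response.get("reports", []):
        data = report.get("data", {})
        read_counts = data.get("samplesReadCounts")
        space_sizes = data.get("samplingSpaceSizes")
        info.append({
            # Either field being present and non-empty indicates sampled data.
            "sampled": bool(read_counts) or bool(space_sizes),
            "samplesReadCounts": read_counts,
            "samplingSpaceSizes": space_sizes,
            "rowCount": data.get("rowCount"),
        })
    return info


# Shaped like the responses in the question: no sampling fields at all.
unsampled = {"reports": [{"data": {"rowCount": 1000000, "isDataGolden": True}}]}

# Hypothetical sampled response; the counts are invented for illustration.
sampled = {"reports": [{"data": {
    "rowCount": 5999,
    "samplesReadCounts": ["499630"],
    "samplingSpaceSizes": ["15328013"],
}}]}

print(sampling_info(unsampled)[0]["sampled"])  # False
print(sampling_info(sampled)[0]["sampled"])    # True
```

Logging this alongside rowCount for each run would show whether the differing row counts line up with sampled responses.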

Filtering objects in jq by existence of nested array values

Given a document like this:
[
{
"KVs" : [
{
"Key": "animal",
"Value": "lion"
},
{
"Key": "mascot",
"Value": "lion"
}
],
"name": "roger"
},
{
"KVs" : [
{
"Key": "animal",
"Value": "zebra"
},
{
"Key": "mascot",
"Value": "lion"
}
],
"name": "linda"
}
]
I want to use jq to select only those elements of the top array that contain the KV pair animal == "lion".
The output for the above JSON document should be:
{
"KVs" : [
{
"Key": "animal",
"Value": "lion"
},
{
"Key": "mascot",
"Value": "lion"
}
],
"name": "roger"
}
Can't figure out how to accomplish this with select(). I know how to use it to select based on one specific field. e.g. by key name: .[] | select(.KVs[].Key == "animal"), right? But how do I tell it to match the same KV object on two fields (Key & Value)?
Never mind, I solved it with the help of jqplay and some trial and error.
This is the solution:
.[] | select(.KVs[] | .Key == "animal" and .Value == "lion")
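One caveat about the filter above: jq's select emits its input once for every true result of the condition, so an object containing two matching pairs would be emitted twice. Wrapping the test in any, as in select(any(.KVs[]; .Key == "animal" and .Value == "lion")), avoids that. The same "at least one pair matches on both fields" logic, as a small Python sketch with a hypothetical helper has_pair:

```python
def has_pair(obj, key, value):
    """True if at least one entry in KVs matches both Key and Value."""
    return any(kv["Key"] == key and kv["Value"] == value for kv in obj["KVs"])


# Same document as in the question.
docs = [
    {
        "KVs": [
            {"Key": "animal", "Value": "lion"},
            {"Key": "mascot", "Value": "lion"},
        ],
        "name": "roger",
    },
    {
        "KVs": [
            {"Key": "animal", "Value": "zebra"},
            {"Key": "mascot", "Value": "lion"},
        ],
        "name": "linda",
    },
]

selected = [doc for doc in docs if has_pair(doc, "animal", "lion")]
print([doc["name"] for doc in selected])  # ['roger']
```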
