Querying Path in Neo4j, How To Show Node/Edge Info Only Once? - graph

If a simple query for a path between two nodes was made, say,
MATCH (m{name:'m'}), (n{name:'n'}),
path = (m)-[:SOME_EDGE*]->(n)
RETURN path
EDIT:
(example result)
...
segments: [
{
start: {
id: 1
labels: [lbl1, lbl2, ...],
properties: [p1, p2, ...]
}
end: { ... }
properties: { ... }
},
{
start: {
id: 1
labels: [lbl1, lbl2, ...] <--- duplicate
properties: [p1, p2, ...] <--- duplicate
}
},
...
]
the result generated contains many duplicates of the properties/types/IDs of the same nodes/edges, again and again, and this gets worse when there are cycles in the paths.
I googled and found that I could use projections like
return [node in nodes(path) | id(node)] as pathNodes,
[r in relationships(path) | {id: id(r), type: type(r)}] as rels
(Example result)
{
pathNodes: [1,2,3],
rels: [{id:101,type:'SOME_EDGE'},{id:102,type:'SOME_EDGE'}]
},
{
pathNodes: [1,2,1,3],
rels: ...
}, ...
But how can I add the node/relationship info (just once per entity) to the result above?
Is there any way to get this done in a single query?

You need to collect the paths, nodes and relationships into distinct lists, and then build maps from them using the apoc.map.setKey function:
MATCH path = (A)-[*]->(B)
UNWIND nodes(path) AS n
UNWIND relationships(path) AS r
WITH
collect(DISTINCT path) as paths,
collect(DISTINCT n) AS nodes,
collect(DISTINCT r) AS rels
RETURN
[p IN paths | {
nodes: [n IN nodes(p) | id(n)],
rels: [r IN relationships(p) | id(r)]
}] as paths,
reduce(acc={}, n IN nodes | apoc.map.setKey(acc, toString(id(n)), n)) as nodes,
reduce(acc={}, r IN rels | apoc.map.setKey(acc, toString(id(r)), r)) as rels
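
For readers who post-process results on the client side, the shape this query returns can be sketched in plain Python. This is only an illustration: the `paths` sample and its `id`, `name` and `type` fields are made-up stand-ins, not objects from any Neo4j driver, and `normalize` is a hypothetical helper name.

```python
# Hypothetical sample data: each path is a list of node dicts plus a list of
# relationship dicts. The field names are assumptions for illustration only.
paths = [
    {"nodes": [{"id": 1, "name": "m"}, {"id": 2, "name": "x"}, {"id": 3, "name": "n"}],
     "rels": [{"id": 101, "type": "SOME_EDGE"}, {"id": 102, "type": "SOME_EDGE"}]},
    {"nodes": [{"id": 1, "name": "m"}, {"id": 3, "name": "n"}],
     "rels": [{"id": 103, "type": "SOME_EDGE"}]},
]


def normalize(paths):
    """Return paths as id lists plus one lookup table per entity kind."""
    nodes, rels, slim_paths = {}, {}, []
    for path in paths:
        for node in path["nodes"]:
            nodes.setdefault(str(node["id"]), node)  # keep the first copy only
        for rel in path["rels"]:
            rels.setdefault(str(rel["id"]), rel)
        slim_paths.append({
            "nodes": [n["id"] for n in path["nodes"]],
            "rels": [r["id"] for r in path["rels"]],
        })
    return {"paths": slim_paths, "nodes": nodes, "rels": rels}


result = normalize(paths)
print(result["paths"])
print(sorted(result["nodes"]))
```

The lookup tables use string keys, mirroring the toString(id(n)) keys built by the query, so each node or relationship appears once no matter how many paths pass through it.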

stdob-- was right about UNWIND and COLLECT, although APOC is not really needed.
I came up with a solution myself months ago and only came back here today, so I accepted his/her answer and am also posting my APOC-free solution here.
UNWIND and re-COLLECT are the key:
MATCH p=(m{name:'m'})-[:SOME_EDGE|SOME_OTHER_EDGE*1..2]->(n{name:'n'})
WITH {
pathNodes: [node IN nodes(p) | ID(node)],
rels: [r IN RELATIONSHIPS(p) | {id:ID(r),ty:TYPE(r)}]
} AS path, p
UNWIND NODES(p) AS node
RETURN {paths:COLLECT(DISTINCT path), nodes: COLLECT(DISTINCT{id:ID(node),name:node.name})}


Jq parse duplicate json file using jqplay.org

Can I output only the ip and source_id? When a source_id is duplicated, all of its ips should be output in one array; when it is not duplicated, the ip can be output with its corresponding source_id.
{"ip":"192.134.5.31","access_key":"223434354656767","source_id":"2e74a68a-2fef-443544-815d-87"}
{"ip":"172.23.54.4","saccess_key":"223434354656767","source_id":"2e74a68a-2fef-443544-815d-87"}
{"ip":"182,555,44.44","access_key":"223434354656767","source_id":"2e74a68a-2fef-443544-815d-222"}
I don't care about the access key here, although it would be great if it could also be handled with jq. This is what I tried:
unique_by(.ip) |{ip.source_id[]}
.ip| select(.source_id[])
You should use group_by to group all the objects with a matching source_id.
Then you can create the desired output, for example:
group_by(.source_id)[] | { ip: map(.ip), source_id: (first.source_id) }
Will output:
{
"ip": [
"182,555,44.44"
],
"source_id": "2e74a68a-2fef-443544-815d-222"
}
{
"ip": [
"192.134.5.31",
"172.23.54.4"
],
"source_id": "2e74a68a-2fef-443544-815d-87"
}
This works because:
We group on source_id.
We create an object for each group, containing a map() of all the ips in the group and the source_id taken from the first object.
Use the --slurp option to combine the input objects into an array:
jq --slurp 'group_by(.source_id)[] | { ip: map(.ip), source_id: (first.source_id) }'
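
As a cross-check of what the filter does, the same grouping can be sketched in Python. This is a rough equivalent rather than a translation of jq internals; `group_ips` is a hypothetical helper name, and the sample records are copied from the question.

```python
import json
from itertools import groupby

# The three sample records from the question, one JSON object per line.
raw = """\
{"ip":"192.134.5.31","access_key":"223434354656767","source_id":"2e74a68a-2fef-443544-815d-87"}
{"ip":"172.23.54.4","saccess_key":"223434354656767","source_id":"2e74a68a-2fef-443544-815d-87"}
{"ip":"182,555,44.44","access_key":"223434354656767","source_id":"2e74a68a-2fef-443544-815d-222"}
"""

records = [json.loads(line) for line in raw.splitlines() if line.strip()]


def group_ips(records):
    """Group records by source_id and collect the ips of each group."""
    def key(rec):
        return rec["source_id"]

    # groupby only merges adjacent items, so sort by the same key first
    # (sorted is stable, which keeps the input order inside each group).
    return [
        {"ip": [rec["ip"] for rec in group], "source_id": source_id}
        for source_id, group in groupby(sorted(records, key=key), key=key)
    ]


for obj in group_ips(records):
    print(json.dumps(obj))
```

Reading every line into a list before grouping plays the role of --slurp here.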
I'm not sure what your expected output is, but if you are trying to group IPs by source id like this:
{
"2e74a68a-2fef-443544-815d-222": [
"182,555,44.44"
],
"2e74a68a-2fef-443544-815d-87": [
"192.134.5.31",
"172.23.54.4"
]
}
Then you can group by your source_id and then transform the output:
group_by(.source_id) | map({(.[0].source_id): map(.ip)}) | add
Or by using a custom function:
def group(k): group_by(k) | map({key:first|k, value:.}) | from_entries;
group(.source_id) | map_values(map(.ip))
or
def group(k;v): group_by(k) | map({key:first|k, value:map(v)}) | from_entries;
group(.source_id;.ip)

How to extrapolate values in one AWS CLI output with values from two separate CLI outputs as input files?

I am trying to build an audit/compliance report from IAM identity center. We need a list of groups and the respective group members. At current count we have 1,500+ users and 700+ Groups across 120 accounts in AWS.
There isn't an API command to spit this data out, so I'm putting a few commands together to extract the groups to files in Cloudshell. Then I need to cross-reference and throw everything into a CSV for filtering in Excel for the auditors.
Retrieve UserName and UserID - store in UserIds.json
aws identitystore list-users --identity-store-id d-123456789 | jq '.Users[] | {Name: .UserName, ID: .UserId}' > UserIds.json
Retrieve Groups and GroupIDs - store in GroupsID.json
aws identitystore list-groups --identity-store-id d-123456789| jq '.Groups[] | {GroupName: .DisplayName, ID:.GroupId}' > GroupsID.json
Retrieve list of All Users per Group - store in GroupMembers.json
result=$(aws identitystore list-groups --identity-store-id d-123456789 | jq -r '.Groups[].GroupId')
for val in $result; do
aws identitystore list-group-memberships --identity-store-id d-123456789 --group-id "$val" |
jq '.GroupMemberships[] | {GroupID: .GroupId, Member: .MemberId.UserId}' >> GroupMembers.json
done
Example output from UserIds.json:
{
"Name": "first.last#example.com",
"ID": "123456789-9876543210-ABCD-4321-1234"
}
{
"Name": "last.first#example.com",
"ID": "12345678-4321-1234-2233-9876543210"
}
Example output from GroupsID.json:
{
"GroupName": "sso-aws-zone-role-CloudCoreOps",
"ID": "123456789-55668877-1234-5522-2255-987654321"
}
{
"GroupName": "sso-aws-zone-role-CloudCoreRO",
"ID": "1234567890-11224455-2255-5522-1343-9876543210"
}
Example output from GroupMembers.json:
{
"GroupID": "123456789-55668877-1234-5522-2255-987654321",
"Member": "123456789-9876543210-ABCD-4321-1234"
}
{
"GroupID": "1234567890-11224455-2255-5522-1343-9876543210",
"Member": "12345678-4321-1234-2233-9876543210"
}
Now I just need to correlate them. I have read that jq can be used like sed, so I should be able to replace the key values in GroupMembers.json: first replace each GroupID with the matching GroupName from GroupsID.json, then replace each Member with the user name whose ID matches in UserIds.json.
I think this can be done in a loop, but I want to learn not only how to do this, but also the best way.
It should be doable with INDEX and JOIN in a two-level nesting:
jq --slurpfile users UserIds.json --slurpfile groups GroupsID.json '
JOIN($groups | INDEX(.ID);
JOIN($users | INDEX(.ID); .; .Member; add);
.GroupID; add) | {Name, GroupName}
' GroupMembers.json
{
"Name": "first.last#example.com",
"GroupName": "sso-aws-zone-role-CloudCoreOps"
}
{
"Name": "last.first#example.com",
"GroupName": "sso-aws-zone-role-CloudCoreRO"
}
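
For comparison, the same two-level lookup can be sketched in Python: build one index per file keyed on ID, then resolve each membership against both. This is a minimal sketch under the assumption that the files hold the concatenated JSON objects shown in the question; `parse_objects` and `join_memberships` are hypothetical helper names, and the sample strings stand in for the file contents.

```python
import json


def parse_objects(text):
    """Parse a stream of concatenated JSON objects (jq's default output)."""
    decoder = json.JSONDecoder()
    pos, items = 0, []
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it by hand
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        items.append(obj)
    return items


def join_memberships(users, groups, memberships):
    """Replace the IDs in each membership with the matching names."""
    user_names = {u["ID"]: u["Name"] for u in users}
    group_names = {g["ID"]: g["GroupName"] for g in groups}
    return [
        {"Name": user_names.get(m["Member"]),
         "GroupName": group_names.get(m["GroupID"])}
        for m in memberships
    ]


# Sample data copied from the question; a real script would read the files.
users = parse_objects("""
{"Name": "first.last#example.com", "ID": "123456789-9876543210-ABCD-4321-1234"}
{"Name": "last.first#example.com", "ID": "12345678-4321-1234-2233-9876543210"}
""")
groups = parse_objects("""
{"GroupName": "sso-aws-zone-role-CloudCoreOps", "ID": "123456789-55668877-1234-5522-2255-987654321"}
{"GroupName": "sso-aws-zone-role-CloudCoreRO", "ID": "1234567890-11224455-2255-5522-1343-9876543210"}
""")
memberships = parse_objects("""
{"GroupID": "123456789-55668877-1234-5522-2255-987654321", "Member": "123456789-9876543210-ABCD-4321-1234"}
{"GroupID": "1234567890-11224455-2255-5522-1343-9876543210", "Member": "12345678-4321-1234-2233-9876543210"}
""")

rows = join_memberships(users, groups, memberships)
for row in rows:
    print(json.dumps(row))
```

Unmatched IDs come out as None in this sketch, which makes gaps easy to spot before the rows are written to CSV.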

Compare json files but ignore values

I would like to compare two JSON files and report the differences, but I am interested in keys only and not values. So for example the "json-diff" between the following two files (of course the real ones are much more complicated):
{
"http": {
"https": true,
"swagger": {
"enabled": false
},
"scalingFactors": [0.1, 0.2]
}
}
{
"http": {
"https": true,
"swagger": {
"enabled": true
},
"scalingFactors": [0.1, 0.1],
"test": true
}
}
should report that there is a missing key:
http.test
but
should not report that the following keys have different values:
http.swagger.enabled
http.scalingFactors
I looked at the jq tool but I am not sure how to ignore values.
Ignoring potential complications having to do with arrays, looking at the "symmetric difference" of the sets of paths to scalars would make sense. As a starting point, you could thus consider the following, which uses an explicit type test rather than paths(scalars) because the latter skips values that are false or null (such as "enabled": false above):
jq -c '
def leaves: paths(type | . != "object" and . != "array");
[leaves] as $f1
| [input | leaves] as $f2
| ($f1 - $f2) + ($f2 - $f1)' file1.json file2.json
You might want to stringify the paths, but then again, it might be wise to avoid doing so if the mapping to the strings is not invertible.
If arrays are present, you might want to compare the paths while ignoring the array indices:
# type test rather than paths(scalars), which skips false and null values
def p: [paths(type | . != "object" and . != "array") | map(select(type=="string"))] | unique;
p as $f1
| (input | p) as $f2
| ($f1 - $f2) + ($f2 - $f1)
| .[]
The last line ensures that the result is a (possibly empty) stream, the point being that this makes it easy to check the return code to determine whether any difference was detected: simply use the -e command-line option. If there are no differences, the return code will then be 4.
One way to check if the stream is empty would be to use the -4
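
The keys-only comparison can also be sketched in Python, which may be easier to adapt. This is a minimal sketch, not a port of the jq program: `key_paths` and `key_diff` are hypothetical helper names, array indices are dropped as in the second filter, and the dotted output is not invertible if a key itself contains a dot.

```python
import json


def key_paths(value, prefix=()):
    """Yield the key path of every non-container value, skipping array indices."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from key_paths(child, prefix + (key,))
    elif isinstance(value, list):
        for child in value:
            yield from key_paths(child, prefix)  # the index is dropped on purpose
    else:
        yield prefix


def key_diff(doc1, doc2):
    """Return dotted key paths present in exactly one of the two documents."""
    paths1, paths2 = set(key_paths(doc1)), set(key_paths(doc2))
    return sorted(".".join(p) for p in paths1 ^ paths2)


# The two sample documents from the question.
file1 = json.loads(
    '{"http": {"https": true, "swagger": {"enabled": false},'
    ' "scalingFactors": [0.1, 0.2]}}'
)
file2 = json.loads(
    '{"http": {"https": true, "swagger": {"enabled": true},'
    ' "scalingFactors": [0.1, 0.1], "test": true}}'
)

print(key_diff(file1, file2))
```

An empty list means the two files have the same key structure, so a wrapper script could exit non-zero whenever the list is non-empty.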

jq: Use context object as key in query from root

I have a JSON object where the relevant parts are of the form
{
"_meta": {
"hostvars": {
"name_1": {
"ansible_host": "10.0.0.1"
},
"name_2": {
"ansible_host": "10.0.0.2"
},
"name_3": {
"ansible_host": "10.0.0.3"
}
}
},
...
"nodes": {
"hosts": [
"name_1",
"name_2"
]
}
}
(the output of ansible-inventory --list, for reference).
I would like to use jq to produce a list of IPs of the nodes hosts by looking up the names in ._meta.hostvars. In the example, the output should be:
10.0.0.1
10.0.0.2
Note that 10.0.0.3 should not be included because name_3 is not in the .nodes.hosts list. So just doing jq -r '._meta.hostvars[].ansible_host' doesn't work.
I've tried jq '.nodes.hosts[] | ._meta.hostvars[.].ansible_host' but that fails because ._meta doesn't scan from the root after the pipe.
You can store the root in a variable before changing the context:
jq -r '. as $root | .nodes.hosts[] | $root._meta.hostvars[.].ansible_host'
But a better solution is to just inline the "hosts" query:
jq -r '._meta.hostvars[.nodes.hosts[]].ansible_host'
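
The lookup itself is simple enough to sketch in Python, which may help when the inventory is consumed from a script rather than a shell pipeline. The sample below is the trimmed inventory from the question; `host_ips` and its `group` parameter are hypothetical names introduced for illustration.

```python
import json

# Trimmed inventory from the question (output of ansible-inventory --list).
inventory = json.loads("""
{
  "_meta": {"hostvars": {
      "name_1": {"ansible_host": "10.0.0.1"},
      "name_2": {"ansible_host": "10.0.0.2"},
      "name_3": {"ansible_host": "10.0.0.3"}}},
  "nodes": {"hosts": ["name_1", "name_2"]}
}
""")


def host_ips(inventory, group="nodes"):
    """Look up ansible_host for every host name listed in the given group."""
    hostvars = inventory["_meta"]["hostvars"]
    return [hostvars[name]["ansible_host"] for name in inventory[group]["hosts"]]


print("\n".join(host_ips(inventory)))
```

Keeping a reference to the root object (here `inventory`) while iterating over the host names is the same idea as binding `$root` in the first jq filter.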

Parsing JSON dict of CloudFormation parameters for '--parameter-overrides'

I'm using AWS CloudFormation at the moment, and I need to parse out parameters due to differences between stack creation and deployment. The aws cloudformation create-stack command accepts a JSON file, but aws cloudformation deploy only accepts inline parameters in Key=Value form.
I have this JSON file:
[
{
"ParameterKey": "EC2KeyPair",
"ParameterValue": "$YOUR_EC2_KEY_PAIR"
},
{
"ParameterKey": "SSHLocation",
"ParameterValue": "$YOUR_SSH_LOCATION"
},
{
"ParameterKey": "DjangoEnvVarDebug",
"ParameterValue": "$YOUR_DJANGO_ENV_VAR_DEBUG"
},
{
"ParameterKey": "DjangoEnvVarSecretKey",
"ParameterValue": "$YOUR_DJANGO_ENV_VAR_SECRET_KEY"
},
{
"ParameterKey": "DjangoEnvVarDBName",
"ParameterValue": "$YOUR_DJANGO_ENV_VAR_DB_NAME"
},
{
"ParameterKey": "DjangoEnvVarDBUser",
"ParameterValue": "$YOUR_DJANGO_ENV_VAR_DB_USER"
},
{
"ParameterKey": "DjangoEnvVarDBPassword",
"ParameterValue": "$YOUR_DJANGO_ENV_VAR_DB_PASSWORD"
},
{
"ParameterKey": "DjangoEnvVarDBHost",
"ParameterValue": "$YOUR_DJANGO_ENV_VAR_DB_HOST"
}
]
And I want to turn it into this:
'EC2KeyPair=$YOUR_EC2_KEY_PAIR SSHLocation=$YOUR_SSH_LOCATION DjangoEnvVarDebug=$YOUR_DJANGO_ENV_VAR_DEBUG DjangoEnvVarSecretKey=$YOUR_DJANGO_ENV_VAR_SECRET_KEY DjangoEnvVarDBName=$YOUR_DJANGO_ENV_VAR_DB_NAME DjangoEnvVarDBUser=$YOUR_DJANGO_ENV_VAR_DB_USER DjangoEnvVarDBPassword=$YOUR_DJANGO_ENV_VAR_DB_PASSWORD DjangoEnvVarDBHost=$YOUR_DJANGO_ENV_VAR_DB_HOST'
This would be the equivalent Python code:
>>> import json
>>> thing = json.load(open('stack-params.example.json', 'r'))
>>> convert = lambda item: f'{item["ParameterKey"]}={item["ParameterValue"]}'
>>> print(list(map(convert, thing)))
['EC2KeyPair=$YOUR_EC2_KEY_PAIR', 'SSHLocation=$YOUR_SSH_LOCATION', 'DjangoEnvVarDebug=$YOUR_DJANGO_ENV_VAR_DEBUG', 'DjangoEnvVarSecretKey=$YOUR_DJANGO_ENV_VAR_SECRET_KEY', 'DjangoEnvVarDBName=$YOUR_DJANGO_ENV_VAR_DB_NAME', 'DjangoEnvVarDBUser=$YOUR_DJANGO_ENV_VAR_DB_USER', 'DjangoEnvVarDBPassword=$YOUR_DJANGO_ENV_VAR_DB_PASSWORD', 'DjangoEnvVarDBHost=$YOUR_DJANGO_ENV_VAR_DB_HOST']
>>> ' '.join(map(convert, thing))
'EC2KeyPair=$YOUR_EC2_KEY_PAIR SSHLocation=$YOUR_SSH_LOCATION DjangoEnvVarDebug=$YOUR_DJANGO_ENV_VAR_DEBUG DjangoEnvVarSecretKey=$YOUR_DJANGO_ENV_VAR_SECRET_KEY DjangoEnvVarDBName=$YOUR_DJANGO_ENV_VAR_DB_NAME DjangoEnvVarDBUser=$YOUR_DJANGO_ENV_VAR_DB_USER DjangoEnvVarDBPassword=$YOUR_DJANGO_ENV_VAR_DB_PASSWORD DjangoEnvVarDBHost=$YOUR_DJANGO_ENV_VAR_DB_HOST'
I have this little snippet:
$ cat stack-params.example.json | jq '.[] | "\(.ParameterKey)=\(.ParameterValue)"'
"EC2KeyPair=$YOUR_EC2_KEY_PAIR"
"SSHLocation=$YOUR_SSH_LOCATION"
"DjangoEnvVarDebug=$YOUR_DJANGO_ENV_VAR_DEBUG"
"DjangoEnvVarSecretKey=$YOUR_DJANGO_ENV_VAR_SECRET_KEY"
"DjangoEnvVarDBName=$YOUR_DJANGO_ENV_VAR_DB_NAME"
"DjangoEnvVarDBUser=$YOUR_DJANGO_ENV_VAR_DB_USER"
"DjangoEnvVarDBPassword=$YOUR_DJANGO_ENV_VAR_DB_PASSWORD"
"DjangoEnvVarDBHost=$YOUR_DJANGO_ENV_VAR_DB_HOST"
But I'm not sure how to join the strings together. I was looking at reduce, but I think it only works on lists, and streams of strings aren't lists. So I'm thinking the correct approach is to convert each key/value association into a 'key=value' string within the list and then join them all together, though I have trouble working with the regex. Does anybody have any tips?
The goal as exemplified by the illustrative output seems highly dubious, but it can easily be achieved using the -r command-line option together with the filter:
map("\(.ParameterKey)=\(.ParameterValue)") | "'" + join(" ") + "'"
Footnote
I was looking at reduce but I think it only works on lists, and streams of strings aren't lists.
To use reduce on a list, say $l, you could simply use [] as in:
reduce $l[] as $x (_;_)
