Create a group vertex for each group, and create outgoing edges to the group vertices - gremlin

I have a gremlin query which groups vertices based on two properties
g.V().hasLabel("PERSON").
group().
by(values('favorite_brand', 'favorite_color').fold()).
next()
It returns a map from each group key to the list of vertices in that group:
1 {('adidas', 'blue'): [v[123], v[456]]}
2 {('nike', 'red'): [v[789]]}
For each group, how can I create a new vertex with an outgoing edge to every vertex in that group, and also set the new group vertex's properties to the values the group was keyed on?
For example for the above, I would create two new vertices. Group Vertex 1 would have 'favorite_brand' as adidas and 'favorite_color' as blue and would have two outgoing edges to the two vertices 123 and 456.
Same for Group Vertex 2
Is there a way to carry this out in Gremlin, or do I have to store the returned hashmap in a variable and loop over it in my lambda to create the new vertices? I'm familiar with the addV step, but how would I iterate through each element in the hashmap and then access the list value? Thanks!
I have looked at the official TinkerPop documentation to understand the group step, but didn't find enough information on how to iterate through the results and perform actions on them.

Not having your data, the answer below is built using the air-routes data set. The initial group can be built using:
g.V().hasId('3','8','12','13','14').
group().
by(values('region', 'country').fold()).
unfold()
which yields
1 {('US-CA', 'US'): [v[13]]}
2 {('US-NY', 'US'): [v[12], v[14]]}
3 {('US-TX', 'US'): [v[3], v[8]]}
From there we can build a query to unroll the group while creating the new nodes and edges.
g.V().hasId('3','8','12','13','14').
group().
by(values('region', 'country').fold()).
unfold().as('grp').
addV('group').as('new-node').
property('region',select(keys).limit(local,1)).
property('country',select(keys).tail(local)).
sideEffect(select('grp').select(values).unfold().addE('new-edge').from('new-node'))
This returns the newly created nodes; the edges are also created, inside the sideEffect step.
1 v[26c2d653-5e9c-3ec9-0854-6ed2a212c63b]
2 v[80c2d653-5e9c-b838-063c-82f6d21cd6e5]
3 v[42c2d653-5e9d-31c7-2c02-c7bf72fe8e38]
We can use the query below to verify everything has worked.
g.V().hasLabel('group').out().path().by(valueMap()).by(id())
Which returns
1 path[{'region': ['US-CA'], 'country': ['US']}, 13]
2 path[{'region': ['US-NY'], 'country': ['US']}, 12]
3 path[{'region': ['US-NY'], 'country': ['US']}, 14]
4 path[{'region': ['US-TX'], 'country': ['US']}, 3]
5 path[{'region': ['US-TX'], 'country': ['US']}, 8]
I used Amazon Neptune to build this answer, but it should work on other TinkerPop-compliant stores.
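For readers who find the imperative version easier to follow, here is a rough plain-Python sketch of what the traversal does: group by two properties, create one group vertex per key, and add an edge from it to each member. It uses in-memory dicts as a stand-in for the graph and does not talk to TinkerPop at all; the helper name `build_group_vertices`, the `group-N` ids, and the sample data (taken from the question) are illustrative assumptions.

```python
from collections import defaultdict


def build_group_vertices(vertices, keys):
    """Group vertices by the given property keys, then create one group
    vertex per distinct key tuple with an edge to each member."""
    # Equivalent of group().by(values(...).fold()): key tuple -> member ids.
    groups = defaultdict(list)
    for vid, props in vertices.items():
        groups[tuple(props[k] for k in keys)].append(vid)

    group_vertices = []
    edges = []
    # Equivalent of unfold(): visit one (key, members) entry at a time.
    for n, (key, members) in enumerate(sorted(groups.items())):
        gid = f"group-{n}"  # hypothetical id scheme for the new vertex
        # Copy the grouping values onto the new group vertex.
        group_vertices.append({"id": gid, **dict(zip(keys, key))})
        # One outgoing edge from the group vertex to each member.
        edges.extend((gid, member) for member in members)
    return group_vertices, edges


people = {
    123: {"favorite_brand": "adidas", "favorite_color": "blue"},
    456: {"favorite_brand": "adidas", "favorite_color": "blue"},
    789: {"favorite_brand": "nike", "favorite_color": "red"},
}
group_vertices, edges = build_group_vertices(
    people, ["favorite_brand", "favorite_color"]
)
```

The single Gremlin traversal is usually preferable on a real graph because it avoids pulling the whole map to the client and issuing one write per group.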

Related

Neo4j Creating relationship between nodes that already map to a relationship

Currently, I have a query in Neo4j that returns all the nodes in my graph that are pointed to by multiple nodes.
The query to return these nodes (in picture) looks like this:
MATCH (n)-[r:CLINICAL_SIGNIFICANCE]->()
WITH n, count(r) as rel_cnt
WHERE rel_cnt > 1
MATCH (c)-[r:PROTEIN_CHANGE]->(n)
return c, n
Is there a way to loop through the nodes labeled as c (blue nodes) and if they point to the same node labeled as n (yellow nodes), create a relationship between the nodes labeled as c (blue nodes)?

In iGraph in R, collapsing 2 nodes into 1 node [duplicate]

I am trying to merge two nodes (call them 'V' and 'U') in a graph (G) into a single node (V).
G is a hyperlink network of 779 nodes (websites). Each edge represents a hyperlink. V and U are actually the same website, but unfortunately the webpages from that website have become split into two separate nodes. So I want to put them back together into a single node.
I have researched the contract.vertices function, but I cannot understand how to adapt it here.
Here are the attributes of my graph (G).
> G
IGRAPH D--- 779 3544 --
+ attr: Image File (v/c), Ringset (v/n), Country Code TLD (v/n), Generic TLD (v/n), Number of Pages (v/n), Categorical 1 (v/n), Categorical 2 (v/n),
Categorical 3 (v/n), id (v/c), label (v/c), Width (e/n)
I have two nodes that I want to merge together:
> V(g)$id[8]
[1] "http://www.police.uk/"
and
> V(g)$id[14]
[1] "http://police.uk/"
In total there are 779 nodes and 3544 edges in the graph.
I want these two nodes to become a single node in the graph (i.e. they will have the same "id"). All inlinks and outlinks from/to other nodes will now point only to this new single node.
All other attributes will remain the same, with the exception of Number of Pages (the value of this will be the sum of both the nodes before they are merged).
contract.vertices is indeed the right function to try, but its API is a bit complicated since it is designed to be able to merge not only a single pair of nodes but also several pairs in a single pass. (It can also permute vertices). To this end, it requires a mapping from the old vertex IDs to the new ones.
In case you are unfamiliar with vertex IDs: igraph identifies each vertex of the graph with an integer in the range 1 to N, where N is the number of vertices. The mapping that contract.vertices requires must be a numeric vector of length N where the i-th element contains the new ID of the node that had ID i before merging.
Suppose that your graph contains 10 nodes. The following mapping vector will simply map each node to the same ID that it already has, so it will not do any merging:
c(1,2,3,4,5,6,7,8,9,10)
Now, suppose that you want to merge node 7 into node 4. You have to tell igraph that the new ID of node 7 will be 4, so you have to change the 7th element in the above vector to 4:
c(1,2,3,4,5,6,4,8,9,10)
This will almost do the job. The problem is that igraph requires the node IDs to be in the range 1 to N, and since you still have a node with ID 10 according to the above mapping, igraph will not delete the old node 7. You can either delete it manually with delete.vertices after you have contracted the vertices, or you can specify a different mapping that not only merges node 7 into node 4 but also changes the ID of node 8 to 7, node 9 to 8 and node 10 to 9:
c(1,2,3,4,5,6,4,7,8,9)
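The renumbering rule above can be sketched in a few lines of Python. This is only an illustration of how the mapping vector is built, not igraph code; the function name `contraction_mapping` is hypothetical, and it assumes the node being merged has the larger ID of the pair.

```python
def contraction_mapping(n, merge_from, merge_into):
    """Return a 1-based mapping (list of length n) that merges vertex
    `merge_from` into `merge_into` and keeps the new IDs contiguous."""
    if not (1 <= merge_into < merge_from <= n):
        raise ValueError("expected 1 <= merge_into < merge_from <= n")
    mapping = []
    for old_id in range(1, n + 1):
        if old_id == merge_from:
            mapping.append(merge_into)   # the merged node takes the target's ID
        elif old_id > merge_from:
            mapping.append(old_id - 1)   # shift later nodes down to close the gap
        else:
            mapping.append(old_id)       # earlier nodes keep their ID
    return mapping


mapping = contraction_mapping(10, merge_from=7, merge_into=4)
print(mapping)  # [1, 2, 3, 4, 5, 6, 4, 7, 8, 9]
```

For the graph in the question, the same rule would be applied with n = 779, merging node 14 into node 8.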
Now, since you also want the Number of Pages attribute of the new node to be the sum of the values of the two old nodes, you must tell igraph what to do with the vertex attributes during the merge. The vertex.attr.comb parameter of contract.vertices serves this purpose. In your case, the value of vertex.attr.comb should be something like this:
list("Number of Pages"="sum", "first")
where "Number of Pages"="sum" means that the new value of the Number of Pages attribute should be calculated by summing the old attribute values, and "first" means that for all other attributes not mentioned here, the new value should be determined by the old value of the first node among the set of nodes that are merged into a single one. See ?attribute.combination in R for more details about the format of this argument.
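To make the "sum" and "first" rules concrete, here is a small Python sketch of how the attributes of the merged nodes would be combined. It only mimics the behaviour described above and is not igraph's implementation; the function name `combine_attributes` is hypothetical and the page counts are made-up example values.

```python
def combine_attributes(records, rules, default="first"):
    """Combine the attribute dicts of the nodes being merged.
    `rules` maps attribute name -> "sum" or "first"; attributes that are
    not mentioned fall back to `default`."""
    combiners = {
        "sum": sum,                          # add up the old values
        "first": lambda values: values[0],   # keep the first node's value
    }
    merged = {}
    for name in records[0]:
        how = rules.get(name, default)
        merged[name] = combiners[how]([record[name] for record in records])
    return merged


merged = combine_attributes(
    [
        {"id": "http://www.police.uk/", "Number of Pages": 120},  # made-up count
        {"id": "http://police.uk/", "Number of Pages": 30},       # made-up count
    ],
    rules={"Number of Pages": "sum"},
)
```

Here `merged` keeps the id of the first node and carries a Number of Pages of 150, the sum of the two example values.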

Query ArangoDB general-graph for common neighbors with more than 2 start vertices?

Consider the following example graph:
 
Given the vertices A, B and C (creators), how can I figure out their common neighbors (the projects all 3 participated in)? For two vertices, I could simply use GRAPH_COMMON_NEIGHBORS("myGraph", A, B), but what if I want to query for 3 or more? Expected result: 1 and 2.
Given the same vertices, how can I make it return only common neighbors with no other connections (projects the creators exclusively participated in, with no additional edges allowed)? Expected result: 1, because 2 has an edge coming from D, which isn't one of the starting vertices.
You can simply pass the same set of vertices as both parameters for common neighbors. Then repack the result in a better format for AQL to compute the intersection:
let res = (
  let nodes = ["a/A","a/B","a/C"]
  for n in GRAPH_COMMON_NEIGHBORS("g",nodes , nodes)
    for f in VALUES(n)
      return VALUES(f)
)
return CALL("intersection", res[0])
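The underlying idea, independent of AQL, is a set intersection over the neighbor sets of all start vertices. The plain-Python sketch below illustrates it, including the second question's "exclusive" variant, which the query above does not cover. The edge list is an assumption reconstructed from the question's description (A, B and C each point to projects 1 and 2, and D points to 2), and both function names are hypothetical.

```python
def common_neighbors(edges, starts):
    """Neighbors reachable by an outgoing edge from every start vertex.
    Assumes `starts` is non-empty."""
    outgoing = {}
    for src, dst in edges:
        outgoing.setdefault(src, set()).add(dst)
    return set.intersection(*(outgoing.get(s, set()) for s in starts))


def exclusive_common_neighbors(edges, starts):
    """Common neighbors whose incoming edges come only from the start vertices."""
    incoming = {}
    for src, dst in edges:
        incoming.setdefault(dst, set()).add(src)
    starts = set(starts)
    return {n for n in common_neighbors(edges, starts) if incoming[n] == starts}


# Assumed edges (creator -> project), reconstructed from the question text.
edges = [
    ("A", 1), ("B", 1), ("C", 1),
    ("A", 2), ("B", 2), ("C", 2), ("D", 2),
]
shared = common_neighbors(edges, ["A", "B", "C"])
exclusive = exclusive_common_neighbors(edges, ["A", "B", "C"])
```

With these edges, `shared` contains projects 1 and 2, while `exclusive` contains only project 1, matching the expected results stated in the question.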

How to delete vertices in graph in R?

I'm trying to remove vertices from a graph with delete.vertices(graph, vertex id), but I get this error: "Cannot create iterator, Invalid vertex id, Invalid vertex id"
This is my code:
graph<-read.graph(paste0("H:/graph.txt"),format="ncol",directed="TRUE")
delete.vertices(graph,c(717527,61373))
717527 and 61373 are my vertex numbers.
and this is the graph:
How can I do this?
You get the error because one of the given vertices is not in the graph's list of vertices. A piece of advice: experiment with the igraph functions on a small graph first, and once you are familiar with them, apply them to your own graph. Here is a demonstration:
library(igraph)
g <- graph.ring(10)
V(g)
Vertex sequence:
[1] 1 2 3 4 5 6 7 8 9 10
Now I try to delete an existing vertex:
delete.vertices(g, c(2,7,9) )
IGRAPH U--- 7 4 -- Ring graph
+ attr: name (g/c), mutual (g/x), circular (g/x)
For a nonexistent vertex, I get your error:
delete.vertices(g, c(20))
Error in delete.vertices(g, c(20)) :
At iterators.c:759 : Cannot create iterator, invalid vertex id, Invalid vertex id
Are you using igraph?
The read.graph function may be reading the vertex IDs in as strings. In that case you may want:
delete.vertices(graph,c('717527','61373'))
instead. Also, delete.vertices returns a new graph object. Make sure to store the result back in your graph variable if you want the changes to show up later.
graph<-read.graph(paste0("H:/graph.txt"),format="ncol",directed="TRUE")
graph<-delete.vertices(graph,c('717527','61373'))
