R: Calculating adjacent vertices after deletion of nodes

I'm very new to R and trying to calculate the adjacent vertices in a graph obtained by deleting certain nodes from an original graph.
However, the output doesn't match the plot of the graph.
For example:
library(igraph)
g <- make_ring(8)
g <- add_edges(g, c(1,2, 2,7, 3,6, 4,5, 8,2, 6,2))
V(g)$label <- 1:8
plot(g)
h <- delete.vertices(g, c(1,2))
plot(h)
If I compute:
adjacent_vertices(h,6)= 5
However, I want the output to be 3,5,7 as the plot shows. The problem lies in the fact that it doesn't know I'm trying to find the adjacent vertices of node labelled 6.
Could someone please help. Thanks.

The issue here is that when you delete vertices, the remaining vertices are renumbered consecutively from 1, so the indices now run from 1 to 6:
> V(h)
+ 6/6 vertices:
[1] 1 2 3 4 5 6
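To find the neighbors using the original vertex numbers, you could simply offset the values by the number of vertices removed (here offset <- 2). Note that this only works because the deleted vertices were the lowest-numbered ones, e.g.: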
> neighbors(h, 6 - offset) + offset
+ 3/6 vertices:
[1] 3 5 7
A better approach, however, would be to refer to the vertex labels instead of using the indices:
> V(g)$label
[1] 1 2 3 4 5 6 7 8
> V(h)$label
[1] 3 4 5 6 7 8
> V(h)[V(h)$label == 6]
+ 1/6 vertex:
[1] 4
To get the neighbors of your vertex of interest, you can modify your code to look like:
> vertex_of_interest <- V(h)[V(h)$label == 6]
> neighbors(h, vertex_of_interest)$label
[1] 3 5 7
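A related option, sketched here as an assumption rather than something from the original answer, is to store the identifiers in the `name` vertex attribute instead of `label`. igraph lets you refer to vertices by name as character strings, so the lookup keeps working after the renumbering:

```r
library(igraph)

# Rebuild the graph from the question
g <- make_ring(8)
g <- add_edges(g, c(1,2, 2,7, 3,6, 4,5, 8,2, 6,2))

# Symbolic vertex names survive deletion; numeric indices do not
V(g)$name <- as.character(1:8)

# Delete by name, then query by name
h <- delete_vertices(g, c("1", "2"))
nb <- neighbors(h, "6")$name
print(nb)
```

The names are character strings, so the vertex is addressed as "6" rather than 6; a bare number would still be read as an index.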

Related

Contract vertices by attribute with igraph

I am working on a graph, where each node has an attribute "group" of the following: "Baby Product", "Book" "CE" "DVD" "Music" "Software" "Toy" "Video" "Video Games".
I would like to know how to plot a graph representing those communities: there should be 9 vertices, one for each group, and a link (possibly weighted) each time two nodes from two categories are connected.
I have tried using the igraph contract function, but this is the result:
> contract(fullnet, mapping=as.factor(products$group), vertex.attr.comb = products$group)
Error in FUN(X[[i]], ...) :
Unknown/unambigous attribute combination specification
In addition: Warning message:
In igraph.i.attribute.combination(vertex.attr.comb) :
Some attributes are duplicated
I guess I have misunderstood what this function is used for.
Now I am thinking about creating a new edgelist, like the one before but with the name of the group in place of the Id of each vertex. Sadly, I do not know how to do this in a fast way on an edgelist of over 1200000 elements.
Thank you very much in advance.
I think using contract() should be correct. In the example code below, I added an anonymous function to vertex.attr.comb to combine the vertices by group. Then, simplify() removes loop edges and calculates the sum of edge weights.
# Create example graph
set.seed(1)
g <- random.graph.game(10, 0.2)
V(g)$group <- rep(letters[1:3], times = c(3, 3, 4))
E(g)$weight <- 1:length(E(g))
E(g)
# + 9/9 edges from 7017c6a:
# [1] 2-- 3 3-- 4 4-- 7 5-- 7 5-- 8 7-- 8 3-- 9 2--10 9--10
E(g)$weight
# [1] 1 2 3 4 5 6 7 8 9
# Contract graph by `group` attribute of vertices
g1 <- contract(g, factor(V(g)$group),
vertex.attr.comb = function(x) levels(factor(x)))
# Remove loop edges and compute the sum of edge weight by group
g1 <- simplify(g1, edge.attr.comb = "sum")
E(g1)
# + 3/3 edges from a852397:
# [1] 1--2 1--3 2--3
E(g1)$weight
# [1] 2 15 12
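The edgelist idea from the question can also be sketched directly. This is only an illustration under assumptions: it hard-codes the nine edges printed above rather than calling the random generator, and the names `el`, `grp`, `df` and `agg` are made up for the example. Each endpoint is replaced by its group, within-group edges are dropped, and the weights are summed per pair of groups:

```r
library(igraph)

# Same edges and weights as the example output above (hard-coded, not random)
g <- make_graph(c(2,3, 3,4, 4,7, 5,7, 5,8, 7,8, 3,9, 2,10, 9,10),
                n = 10, directed = FALSE)
V(g)$group <- rep(letters[1:3], times = c(3, 3, 4))
E(g)$weight <- seq_len(ecount(g))

# Map both endpoints of every edge to their group
el   <- as_edgelist(g, names = FALSE)
grp  <- V(g)$group
from <- grp[el[, 1]]
to   <- grp[el[, 2]]

# Order each pair so that a-b and b-a end up in the same row
df <- data.frame(a = pmin(from, to), b = pmax(from, to),
                 weight = E(g)$weight, stringsAsFactors = FALSE)

# Drop within-group edges and sum weights per group pair
df  <- df[df$a != df$b, ]
agg <- aggregate(weight ~ a + b, data = df, FUN = sum)

# One vertex per group, one weighted edge per connected pair of groups
gg <- graph_from_data_frame(agg, directed = FALSE)
print(agg)
```

Everything here is vectorised, so it should be worth trying on a large edgelist, although I have not timed it against contract().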

Find number of mutual edges of vertices in igraph in R

This should be straightforward, but I want to obtain the number of mutual edges associated with all the vertices in my graph:
library(igraph)
ed <- data.frame(from = c(1,1,2,3,3), to = c(2,3,1,1,2))
ver <- data.frame(id = 1:3)
gr <- graph_from_data_frame(d = ed,vertices = ver, directed = T)
plot(gr)
I know I can use which_mutual for edges, but is there an equivalent command for getting something like this:
# vertex edges no_mutual
# 1 2 2
# 2 1 1
# 3 2 1
UPDATE: Corrected inconsistencies in output table as pointed out by emilliman5
Here's a one-liner solution:
> table(unlist(strsplit(attr(E(gr)[which_mutual(gr)],"vnames"),"\\|")))/2
1 2 3
2 1 1
It relies on the "vnames" attribute of an edge sequence, which holds the vertex names of each edge as a "|"-separated string. Splitting on "|" and tabulating gives a count of every vertex that appears in a mutual edge. Each vertex appears twice per mutual pair (once for each direction), so divide by two.
If there's a less hacky way of getting vertex names from an edgelist, I'm sure Gabor knows it.
Here's that trick in more detail:
For your graph gr:
> E(gr)
+ 5/5 edges (vertex names):
[1] 1->2 1->3 2->1 3->1 3->2
You can get vertexes for edges thus:
> attr(E(gr),"vnames")
[1] "1|2" "1|3" "2|1" "3|1" "3|2"
So my one-liner subsets that edge list by the mutuality criterion, then manipulates the strings.
I am not sure how well this will scale, but it gets the job done. Your expected table has some inconsistencies so I did the best I could, i.e. vertex 2 only has one originating edge not 2.
mutual_edges <- lapply(V(gr), function(x) which_mutual(gr, es = E(gr)[from(x) | to(x)]))
df <- data.frame(Vertex=names(mutual_edges),
Edges=unlist(lapply(V(gr), function(x) length(E(gr)[from(x)]) )),
no_mutual=unlist(lapply(mutual_edges, function(x) sum(x)/2)))
df
# Vertex Edges no_mutual
#1 1 2 2
#2 2 1 1
#3 3 2 1
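For what it's worth, the string splitting can probably be avoided with ends(), which returns the endpoints of an edge sequence as a matrix. A minimal sketch on the same graph (the variable names `mutual_ends` and `no_mutual` are just for illustration):

```r
library(igraph)

ed  <- data.frame(from = c(1,1,2,3,3), to = c(2,3,1,1,2))
ver <- data.frame(id = 1:3)
gr  <- graph_from_data_frame(d = ed, vertices = ver, directed = TRUE)

# Endpoints (by vertex name) of the edges that have a reciprocal edge
mutual_ends <- ends(gr, E(gr)[which_mutual(gr)])

# Each mutual pair lists both of its vertices twice, hence the division by 2.
# Using factor levels keeps vertices with no mutual edges in the table as 0.
no_mutual <- table(factor(as.vector(mutual_ends), levels = V(gr)$name)) / 2
print(no_mutual)
```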

Remove edges by specifying endpoints

How can I delete edges from a graph by naming their endpoints?
delete_edges expects edge numbers, and it's not clear to me the mapping between endpoints and edge numbers.
library(igraph)
g = make_ring(10)
Say I wanted to remove the edges between nodes 7&8 and nodes 9&10.
A hackish way to do so is:
g = delete_edges(g, c(7, 9))
But I had to inspect the output of E(g) closely before figuring out that those edges are numbered 7 & 9.
I tried looking for how the print methods assign the node mapping to E(g) but it looks like quite the rabbit hole.
It looks like you can do this with a string argument -- see the second example in ?delete_edges.
g = delete_edges(g, c("7|8", "9|10"))
g
# IGRAPH U--- 10 8 -- Ring graph
# + attr: name (g/c), mutual (g/l), circular (g/l)
# + edges:
# [1] 1-- 2 2-- 3 3-- 4 4-- 5 5-- 6 6-- 7 8-- 9 1--10
Apparently c("7|8", "9|10") also counts as an "edge sequence" as described in the edges argument.
Note that:
get.edge.ids(g, c(7,8, 9, 10))
will return edge ids 7, 9. Therefore
delete_edges(g, get.edge.ids(g, c(7,8, 9, 10)))
produces the desired result:
[1] 1-- 2 2-- 3 3-- 4 4-- 5 5-- 6 6-- 7 8-- 9 1--10
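If the endpoints live in a two-column table, it may be easier to keep them as one pair per row and flatten them into the from, to, from, to order that get.edge.ids expects. A small sketch, where `pairs` is a made-up name (I believe recent igraph releases also spell the function get_edge_ids, but treat that as an assumption and check your version):

```r
library(igraph)

g <- make_ring(10)

# One row per edge to remove: (from, to)
pairs <- rbind(c(7, 8),
               c(9, 10))

# t() then as.vector() interleaves the rows: 7, 8, 9, 10
ids <- get.edge.ids(g, as.vector(t(pairs)))

g2 <- delete_edges(g, ids)
print(ecount(g2))
```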

Extract edges between community nodes and other nodes

Suppose we have a simple weighted network on which we perform some sort of community detection. Next we extract a particular community, and the final task is to extract all edges between nodes of this community and all other nodes.
Below I pasted the toy code.
# Create toy graph
library(igraph)
set.seed(12345)
g <- make_graph("Zachary")
# Add weights to edges
E(g)$weight <- sample(x = 1:10, size = ecount(g), replace = TRUE)
# Run community detection
cl <- cluster_louvain(g)
There are 5 nodes which belong to community #1, 12 nodes which belong to community #2, etc.
> table(membership(cl))
1 2 3 4
5 12 2 15
Now we extract community #1:
g1 <- induced_subgraph(g, which(cl$membership == 1))
Question: how to find edges which connect nodes in community #1 with all other nodes (excluding edges which define community #1)?
Start by getting all edges incident to a vertex in your community:
all_edges <- E(g)[inc(V(g)[membership(cl) == 1])]
all_edges
+ 10/78 edges:
[1] 1-- 5 1-- 6 1-- 7 1--11 5-- 7 5--11 6-- 7 6--11 6--17 7--17
Then, filter out the ones that are completely internal (both vertices are in the community):
all_edges_m <- get.edges(g, all_edges) #matrix representation
all_edges[!(
all_edges_m[, 1] %in% V(g)[membership(cl) == 1] &
all_edges_m[, 2] %in% V(g)[membership(cl) == 1]
)] # filter where in col1 and col2
+ 4/78 edges:
[1] 1-- 5 1-- 6 1-- 7 1--11
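The two steps can also be folded into a single test: an edge leaves the community exactly when one of its endpoints is inside and the other is not. A hedged sketch of that idea (the names `in_comm`, `el`, `is_boundary` and `boundary` are illustrative, and the Louvain result may differ between igraph versions, so no output is shown):

```r
library(igraph)

set.seed(12345)
g <- make_graph("Zachary")
E(g)$weight <- sample(x = 1:10, size = ecount(g), replace = TRUE)
cl <- cluster_louvain(g)

# TRUE for vertices that belong to community #1
in_comm <- as.vector(membership(cl)) == 1

# Endpoint indices of every edge, one row per edge
el <- ends(g, E(g), names = FALSE)

# Exactly one endpoint inside the community
is_boundary <- xor(in_comm[el[, 1]], in_comm[el[, 2]])
boundary <- E(g)[is_boundary]
print(boundary)
```

Since `boundary` is an ordinary edge sequence, the weights of those edges are available as boundary$weight.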

Clustering connected set of points (longitude,latitude) using R

I am working with centered longitude (x) and latitude (y) data. My goal is to cluster the connected locations.
Two locations on earth (x1,y1) and (x2,y2) are said to be connected if earth_distance((x1,y1),(x2,y2)) < 15 kilometers.
I am using the distHaversine function in R, to calculate earth distance.
Here is some sample data,
x=c(1.000000, 1.055672, 1.038712, 1.094459, 1.133179, 1.116241, 1.126053, 1.181824 ,1.377892, 5.869881, 5.925270, 5.909721)
and
y=c(1.333368,1.304790,1.347332,1.318743,1.332676,1.375229,1.572287,1.544174,2.371105,2.337032,2.383415)
also
distance <- distHaversine(c(x,y))
I wish to find the different clusters formed by the connected sets of points (each connected set of points forms a cluster).
I looked at How to cluster points and plot but I could not solve my problem.
Any reference, suggestion or answer will be very much appreciated.
Maybe this. First make some coordinates:
> x=c(1.000000, 1.055672, 1.038712, 1.094459, 1.133179, 1.116241, 1.126053, 1.181824 ,1.377892, 5.869881, 5.925270)
> y=c(1.333368, 1.304790, 1.347332, 1.318743, 1.332676, 1.375229, 1.572287, 1.544174, 2.371105 ,2.337032, 2.383415)
Make into a data frame
> xy = data.frame(x=x,y=y)
Now use outer to loop over all pairs of rows and columns to compute a full distance matrix. This does twice as much work as is really necessary since it computes i to j and j to i for all i and j. Anyway, it gets us a distance matrix:
> library(geosphere)  # provides distHaversine
> dmat = outer(1:nrow(xy), 1:nrow(xy), function(i,j)distHaversine(xy[i,],xy[j,]))
Now we want a connectivity matrix, which is any pair closer than 15,000 metres:
> cmat = dmat < 15000
Now we use the igraph package to build a connectivity graph object:
> require(igraph)
> cgraph = graph.adjacency(cmat)
You can plot this to see the cluster formation, but note these are not plotted in your x-y space:
> plot(cgraph)
Now to get the connected clusters:
> clusters(cgraph)
$membership
[1] 1 1 1 1 1 1 2 2 3 4 4
$csize
[1] 6 2 1 2
$no
[1] 4
Which you can add to your data frame thus:
> xy$cluster = clusters(cgraph)$membership
> xy
x y cluster
1 1.000000 1.333368 1
2 1.055672 1.304790 1
3 1.038712 1.347332 1
4 1.094459 1.318743 1
5 1.133179 1.332676 1
6 1.116241 1.375229 1
7 1.126053 1.572287 2
8 1.181824 1.544174 2
9 1.377892 2.371105 3
10 5.869881 2.337032 4
11 5.925270 2.383415 4
And plot:
> plot(xy$x,xy$y,col=xy$cluster)
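The same steps can be written a bit more compactly. This is a sketch under assumptions, not a replacement for the answer above: it uses distm() from geosphere for the full distance matrix, and the newer igraph names graph_from_adjacency_matrix() and components() in place of graph.adjacency() and clusters():

```r
library(geosphere)
library(igraph)

x <- c(1.000000, 1.055672, 1.038712, 1.094459, 1.133179, 1.116241,
       1.126053, 1.181824, 1.377892, 5.869881, 5.925270)
y <- c(1.333368, 1.304790, 1.347332, 1.318743, 1.332676, 1.375229,
       1.572287, 1.544174, 2.371105, 2.337032, 2.383415)
xy <- cbind(x, y)  # columns in (longitude, latitude) order

# Full pairwise distance matrix in metres
dmat <- distm(xy, fun = distHaversine)

# Connect every pair closer than 15 km; diag = FALSE skips self-loops
cmat   <- dmat < 15000
cgraph <- graph_from_adjacency_matrix(cmat * 1, mode = "undirected", diag = FALSE)

# Connected components are the clusters
comp <- components(cgraph)
cluster <- comp$membership
print(cluster)
```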
