Combine community detection with connected components grouping igraph R

I use igraph's cluster_spinglass to detect compartments (communities) in a directed network, but it only works on connected graphs:
g <- graph_from_literal( 1 -+ 4 -+ 7,2 -+ 5 -+ 9, 4+-5,
3 -+ 6,5 -+8, 8-+ 9, simplify = FALSE)
m<-cluster_spinglass(g)
This gives an error. The solution is to extract the largest connected component:
dg <- components(g)
g1 <- induced_subgraph(g, which(dg$membership == which.max(dg$csize)))
m<-cluster_spinglass(g1)
I get the memberships of the nodes (vertices) with
m$membership
But here I don't have all the nodes of the original network g. I would like to add another group with the missing nodes, so that all the original nodes are classified into groups.

You can transfer the result back into your original graph g.
In your example, I think that you just want the vertices in the
other connected component to form another community, so it suffices to assign all nodes in the second component to group 3.
V(g)$membership = 3
V(g)[V(g1)$name]$membership = m$membership
V(g)$membership
[1] 1 1 1 2 2 2 3 3 2
But in a more general example, there might be multiple components and those components might break up into multiple communities.
To cover that, you can loop through all components, compute the communities and then transfer those back to the original graph.
V(g)$membership = 0
for(comp in unique(dg$membership)) {
    g1 <- induced_subgraph(g, which(dg$membership == comp))
    m <- cluster_spinglass(g1)
    V(g)[V(g1)$name]$membership = m$membership + max(V(g)$membership)
}
V(g)$membership
[1] 1 1 1 2 2 2 3 3 2
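The loop above can also be wrapped into a reusable function. The following is only a sketch under some assumptions: cluster_by_component and the orig_id vertex attribute are names made up for illustration (they are not part of igraph), and the helper tracks vertices by position so that it also works on graphs without a name attribute.

```r
library(igraph)

# Hypothetical helper (not part of igraph): run a community detection
# function on every connected component and return one membership vector
# whose labels are unique across the whole graph.
cluster_by_component <- function(g, cluster_fun = cluster_spinglass) {
  V(g)$orig_id <- seq_len(vcount(g))   # remember each vertex's position in g
  comp <- components(g)
  result <- integer(vcount(g))
  offset <- 0L
  for (k in seq_len(comp$no)) {
    sub <- induced_subgraph(g, which(comp$membership == k))
    if (vcount(sub) == 1) {
      local <- 1L                      # isolated vertex: its own community
    } else {
      local <- as.integer(membership(cluster_fun(sub)))
    }
    # Shift the labels so they do not collide with earlier components
    result[V(sub)$orig_id] <- local + offset
    offset <- max(result)
  }
  result
}

g <- graph_from_literal(1 -+ 4 -+ 7, 2 -+ 5 -+ 9, 4 +- 5,
                        3 -+ 6, 5 -+ 8, 8 -+ 9, simplify = FALSE)
mem <- cluster_by_component(g)
```

Spin-glass clustering is stochastic, so the exact labels in mem may differ between runs; what should hold is that vertices from different components never share a label.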

Related

Get a graph from all vertex with incident edges R igraph

I would like to obtain a subgraph of a graph, made up of all the vertices reached by starting from some given vertices and following the incoming edges until there are no more of them. With the following code I only get the first neighbours:
g <- graph_from_literal( 1 -+ 4 -+ 5 -+ 8,2 -+ 5 , 3-+6-+7, 4+-3, 4-+8, 5 -+9, simplify = FALSE)
adjacent_vertices(g, V(g)[c("7","9")], mode="in")
I know that I should make some kind of loop but adjacent_vertices returns a list and I can't figure out how to make it.
For this example, the result should be
graph_from_literal( 1 -+ 4 -+ 5 ,2 -+ 5 , 3-+6-+7, 4+-3, 5 -+9, simplify = FALSE)
make_ego_graph can be used to find subgraphs in the neighbourhood of specific nodes.
You can search through the full graph by setting the order parameter in
the function make_ego_graph.
Example
library(igraph)
# Your graph
g = graph_from_literal( 1 -+ 4 -+ 5 -+ 8, 2 -+ 5 , 3-+6-+7, 4+-3, 4-+8, 5 -+9,
simplify = FALSE)
# Set the order equal to the number of nodes in the graph
sg = make_ego_graph(g, nodes = V(g)[c("7","9")], order=length(V(g)), mode="in")
# This returns a list of two subgraphs, one for each starting node (7 and 9).
# You can join the subgraphs into a single graph by using
do.call(union, sg)
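As an alternative sketch (not taken from the answer above), the same vertex set can be collected with subcomponent(), which returns every vertex that can reach a given vertex when mode = "in". The variable names ancestors and keep are illustrative only.

```r
library(igraph)

g <- graph_from_literal(1 -+ 4 -+ 5 -+ 8, 2 -+ 5, 3 -+ 6 -+ 7, 4 +- 3, 4 -+ 8, 5 -+ 9,
                        simplify = FALSE)

# Names of every vertex from which "7" or "9" can be reached
# (subcomponent() includes the starting vertex itself).
ancestors <- lapply(c("7", "9"), function(v) as_ids(subcomponent(g, v, mode = "in")))
keep <- unique(unlist(ancestors))

# One subgraph containing those vertices and all edges of g between them.
sg <- induced_subgraph(g, keep)
```

For this example the result has 8 vertices (everything except 8) and 7 edges, which matches the graph the question asks for.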

R Igraph subgraph given node index and number of nodes to include in the graph

I want to plot a portion of a graph according to a specific node and ideally a distance from that node or a number of nodes as part of the sub graph.
The data.frame that I am graphing is as follows:
Column 1 Column 2 Sequence
A B 1
A D 2
D B 3
Z E 4
E D 5
this is the code:
network <- graph.data.frame(data_to_graph[,c(1,2)])
subnetwork <- induced.subgraph(network, vids = 30, impl = 'copy_and_delete', eids = c(5,6,7,8,9,10,11,12,13,14,15))
plot(subnetwork)
I would like, by specifying an element of column 1 to plot a graph at a certain distance from that node.
Thanks
Dario.
This is the answer:
distan <- 3
node <- "node name"
subnetwork <- induced.subgraph(network, vids = as.vector(unlist(neighborhood(network, distan, nodes = node, mode = 'all'))))
plot.igraph(subnetwork, vertex.size=10)
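For illustration, here is a minimal sketch of the same idea on the five-row table from the question, using make_ego_graph. The column names from and to are assumptions, since the question only shows "Column 1" and "Column 2".

```r
library(igraph)

# The small edge list from the question (column names are assumed)
data_to_graph <- data.frame(from = c("A", "A", "D", "Z", "E"),
                            to   = c("B", "D", "B", "E", "D"),
                            Sequence = 1:5)
network <- graph_from_data_frame(data_to_graph[, c(1, 2)])

# Subgraph of everything within `distan` steps of `node`, ignoring edge direction
node <- "A"
distan <- 2
subnetwork <- make_ego_graph(network, order = distan, nodes = node, mode = "all")[[1]]
```

With distan set to 2, the subgraph contains A, B, D and E; Z is three steps away and is left out. Calling plot(subnetwork, vertex.size = 10) then draws it.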

r igraph find all cycles

I have a directed igraph graph and want to fetch all the cycles. The girth function works but only returns the smallest cycle. Is there a way in R to fetch all the cycles in a graph of length greater than 3 (no vertex pointing to itself and no loops)?
It is not directly a function in igraph, but of course you can code it up. To find a cycle, you start at some node, go to some neighboring node and then find a simple path back to the original node. Since you did not provide any sample data, I will illustrate with a simple example.
Sample data
## Sample graph
library(igraph)
set.seed(1234)
g = erdos.renyi.game(7, 0.29, directed=TRUE)
plot(g, edge.arrow.size=0.5)
Finding Cycles
Let me start with just one node and one neighbor. Node 2 connects to Node 4. So some cycles may look like 2 -> 4 -> (Nodes other than 2 or 4) -> 2. Let's get all of the paths like that.
v1 = 2
v2 = 4
lapply(all_simple_paths(g, v2,v1, mode="out"), function(p) c(v1,p))
[[1]]
[1] 2 4 2
[[2]]
[1] 2 4 3 5 7 6 2
[[3]]
[1] 2 4 7 6 2
We see that there are three cycles starting at 2 with 4 as the second node. (I know that you said length greater than 3. I will come back to that.)
Now we just need to do that for every node v1 and every neighbor v2 of v1.
Cycles = NULL
for(v1 in V(g)) {
    for(v2 in neighbors(g, v1, mode="out")) {
        Cycles = c(Cycles,
            lapply(all_simple_paths(g, v2,v1, mode="out"), function(p) c(v1,p)))
    }
}
This gives 17 cycles in the whole graph. There are two issues though that you may need to look at depending on how you want to use this. First, you said that you wanted cycles of length greater than 3, so I assume that you do not want the cycles that look like 2 -> 4 -> 2. These are easy to get rid of.
LongCycles = Cycles[which(sapply(Cycles, length) > 3)]
LongCycles has 13 cycles having eliminated the 4 short cycles
2 -> 4 -> 2
4 -> 2 -> 4
6 -> 7 -> 6
7 -> 6 -> 7
But that list points out the other problem. There are still some cycles that you might think of as duplicates. For example:
2 -> 7 -> 6 -> 2
7 -> 6 -> 2 -> 7
6 -> 2 -> 7 -> 6
You might want to weed these out. To get just one copy of each cycle, you can always choose the vertex sequence that starts with the smallest vertex number. Thus,
LongCycles[sapply(LongCycles, min) == sapply(LongCycles, `[`, 1)]
[[1]]
[1] 2 4 3 5 7 6 2
[[2]]
[1] 2 4 7 6 2
[[3]]
[1] 2 7 6 2
This gives just the distinct cycles.
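Because the random graph above depends on the seed and the R version, it may help to check the same three steps on a graph small enough to verify by hand. This is only a sketch: the four-edge graph below is made up for illustration and is not from the question.

```r
library(igraph)

# Tiny directed graph: 1 -> 2 -> 3 -> 1 plus the back edge 2 -> 1
g <- make_graph(c(1, 2, 2, 3, 3, 1, 2, 1), directed = TRUE)

# Step 1: for every edge v1 -> v2, close each simple path v2 ~> v1 into a cycle
Cycles <- list()
for (v1 in seq_len(vcount(g))) {
  for (v2 in as.integer(neighbors(g, v1, mode = "out"))) {
    paths <- all_simple_paths(g, from = v2, to = v1, mode = "out")
    Cycles <- c(Cycles, lapply(paths, function(p) c(v1, as.integer(p))))
  }
}

# Step 2: drop the two-vertex cycles such as 1 -> 2 -> 1
LongCycles <- Cycles[sapply(Cycles, length) > 3]

# Step 3: keep only the rotation that starts at its smallest vertex
Distinct <- LongCycles[sapply(LongCycles, min) == sapply(LongCycles, `[`, 1)]
```

Here Cycles holds 5 vertex sequences, LongCycles holds the 3 rotations of the triangle, and Distinct holds the single cycle 1 2 3 1.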
Addition regarding efficiency and scalability
I am providing a much more efficient version of the code that I
originally provided. However, it is primarily for the purpose of
arguing that, except for very simple graphs, you will not be able
to produce all cycles.
Here is some more efficient code. It eliminates checking many
cases that either cannot produce a cycle or will be eliminated
as a redundant cycle. In order to make it easy to run the tests
that I want, I made it into a function.
## More efficient version
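FindCycles = function(g) {
    Cycles = NULL
    for(v1 in V(g)) {
        # A vertex with no incoming edge cannot lie on a cycle
        if(degree(g, v1, mode="in") == 0) { next }
        # Only start along edges to higher-numbered vertices; any other
        # cycle would not begin at its smallest vertex and is dropped below
        GoodNeighbors = neighbors(g, v1, mode="out")
        GoodNeighbors = GoodNeighbors[GoodNeighbors > v1]
        for(v2 in GoodNeighbors) {
            TempCyc = lapply(all_simple_paths(g, v2,v1, mode="out"), function(p) c(v1,p))
            # Drop two-vertex cycles such as 2 -> 4 -> 2
            TempCyc = TempCyc[which(sapply(TempCyc, length) > 3)]
            # Keep only the copy that starts at its smallest vertex
            TempCyc = TempCyc[sapply(TempCyc, min) == sapply(TempCyc, `[`, 1)]
            Cycles = c(Cycles, TempCyc)
        }
    }
    Cycles
}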
However, except for very simple graphs, there is a combinatorial
explosion of possible paths and so finding all possible cycles is
completely impractical. I will illustrate this with graphs much smaller
than the one that you mention in the comments.
First, I will start with some small graphs where the number of edges
is approximately twice the number of vertices. Code to generate my
examples is below but I want to focus on the number of cycles, so I
will just start with the results.
## ecount ~ 2 * vcount
Nodes Edges Cycles
10 21 15
20 41 18
30 65 34
40 87 424
50 108 3433
55 117 22956
But you report that your data has approximately 5 times as
many edges as vertices. Let's look at some examples like that.
## ecount ~ 5 * vcount
Nodes Edges Cycles
10 48 3511
12 61 10513
14 71 145745
With this as the growth of the number of cycles, using 10K nodes
with 50K edges seems to be out of the question. BTW, it took several
minutes to compute the example with 14 vertices and 71 edges.
For reproducibility, here is how I generated the above data.
set.seed(1234)
g10 = erdos.renyi.game(10, 0.2, directed=TRUE)
ecount(g10)
length(FindCycles(g10))
set.seed(1234)
g20 = erdos.renyi.game(20, 0.095 , directed=TRUE)
ecount(g20)
length(FindCycles(g20))
set.seed(1234)
g30 = erdos.renyi.game(30, 0.056 , directed=TRUE)
ecount(g30)
length(FindCycles(g30))
set.seed(1234)
g40 = erdos.renyi.game(40, 0.042 , directed=TRUE)
ecount(g40)
length(FindCycles(g40))
set.seed(1234)
g50 = erdos.renyi.game(50, 0.038 , directed=TRUE)
ecount(g50)
length(FindCycles(g50))
set.seed(1234)
g55 = erdos.renyi.game(55, 0.035 , directed=TRUE)
ecount(g55)
length(FindCycles(g55))
##########
set.seed(1234)
h10 = erdos.renyi.game(10, 0.55, directed=TRUE)
ecount(h10)
length(FindCycles(h10))
set.seed(1234)
h12 = erdos.renyi.game(12, 0.46, directed=TRUE)
ecount(h12)
length(FindCycles(h12))
set.seed(1234)
h14 = erdos.renyi.game(14, 0.39, directed=TRUE)
ecount(h14)
length(FindCycles(h14))

Find number of mutual edges of vertices in igraph in R

This should be straightforward, but I want to obtain the number of mutual edges associated with all the vertices in my graph:
library(igraph)
ed <- data.frame(from = c(1,1,2,3,3), to = c(2,3,1,1,2))
ver <- data.frame(id = 1:3)
gr <- graph_from_data_frame(d = ed, vertices = ver, directed = TRUE)
plot(gr)
I know I can use which_mutual for edges, but is there an equivalent command for getting something like this:
# vertex edges no_mutual
# 1 2 2
# 2 1 1
# 3 2 1
UPDATE: Corrected inconsistencies in output table as pointed out by emilliman5
Here's a one-liner solution:
> table(unlist(strsplit(attr(E(gr)[which_mutual(gr)],"vnames"),"\\|")))/2
1 2 3
2 1 1
It relies on the "vnames" attribute of an edge sequence, which stores the vertex names of each edge as a "|"-separated string. Splitting on "|" and tabulating gives a table of all vertices in mutual edges. Each vertex appears twice per mutual pair, so divide by two.
If there's a less hacky way of getting vertex names from an edgelist, I'm sure Gabor knows it.
Here's that trick in more detail:
For your graph gr:
> E(gr)
+ 5/5 edges (vertex names):
[1] 1->2 1->3 2->1 3->1 3->2
You can get vertexes for edges thus:
> attr(E(gr),"vnames")
[1] "1|2" "1|3" "2|1" "3|1" "3|2"
So my one-liner subsets that edge list by the mutuality criterion, then manipulates the strings.
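A possibly less hacky variant of the same trick is to take the endpoint names from ends() instead of parsing the "vnames" strings. This is a sketch, not part of the original answer; mutual_ends and no_mutual are illustrative names.

```r
library(igraph)

ed <- data.frame(from = c(1, 1, 2, 3, 3), to = c(2, 3, 1, 1, 2))
ver <- data.frame(id = 1:3)
gr <- graph_from_data_frame(d = ed, vertices = ver, directed = TRUE)

# Two-column matrix of endpoint names, one row per mutual edge
mutual_ends <- ends(gr, E(gr)[which_mutual(gr)])

# Each mutual pair contributes two directed edges, hence the division by 2
no_mutual <- table(mutual_ends) / 2
```

For this graph no_mutual gives 2 for vertex 1 and 1 for vertices 2 and 3, the same counts as the string-based one-liner.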
I am not sure how well this will scale, but it gets the job done. Your expected table has some inconsistencies so I did the best I could, i.e. vertex 2 only has one originating edge not 2.
mutual_edges <- lapply(V(gr), function(x) which_mutual(gr, es = E(gr)[from(x) | to(x)]))
df <- data.frame(Vertex=names(mutual_edges),
Edges=unlist(lapply(V(gr), function(x) length(E(gr)[from(x)]) )),
no_mutual=unlist(lapply(mutual_edges, function(x) sum(x)/2)))
df
# Vertex Edges no_mutual
#1 1 2 2
#2 2 1 1
#3 3 2 1

R: Calculating adjacent vertex after deletion of nodes

I'm very new to R and trying to calculate the adjacent vertices of a graph, which is obtained from deleting certain nodes from an original graph.
However, the output of the result doesn't match with the plot of the graph.
For example:
library(igraph)
g <- make_ring(8)
g <- add_edges(g, c(1,2, 2,7, 3,6, 4,5, 8,2, 6,2))
V(g)$label <- 1:8
plot(g)
h <- delete.vertices(g, c(1,2))
plot(h)
If I compute:
adjacent_vertices(h,6)= 5
However, I want the output to be 3,5,7 as the plot shows. The problem lies in the fact that it doesn't know I'm trying to find the adjacent vertices of node labelled 6.
Could someone please help. Thanks.
The issue here is that when you delete the vertices, the remaining vertices are renumbered, so their indices now run from 1 to 6:
> V(h)
+ 6/6 vertices:
[1] 1 2 3 4 5 6
To find the neighbors using the original vertex numbers, you could simply offset the values by the number of vertices removed. This works here only because the deleted vertices were the two lowest-numbered ones, e.g.:
> offset <- 2
> neighbors(h, 6 - offset) + offset
+ 3/6 vertices:
[1] 3 5 7
A better approach, however, would be to refer to the vertex labels instead of using the indices:
> V(g)$label
[1] 1 2 3 4 5 6 7 8
> V(h)$label
[1] 3 4 5 6 7 8
> V(h)[V(h)$label == 6]
+ 1/6 vertex:
[1] 4
To get the neighbors of your vertex of interest, you can modify your code to look like:
> vertex_of_interest <- V(h)[V(h)$label == 6]
> neighbors(h, vertex_of_interest)$label
[1] 3 5 7
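Another option, sketched here as an assumption and not taken from the answer, is to store the identifiers in the name vertex attribute. igraph then lets you refer to vertices by name directly, and the names stay attached to their vertices after deletion.

```r
library(igraph)

g <- make_ring(8)
g <- add_edges(g, c(1,2, 2,7, 3,6, 4,5, 8,2, 6,2))
V(g)$name <- as.character(1:8)   # symbolic names survive vertex deletion

h <- delete_vertices(g, c("1", "2"))

# "6" is looked up by name, not by its position in h
nb <- neighbors(h, "6")$name
```

Here nb contains "3", "5" and "7", matching the plot of h.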

Resources