How to compare communities in two consecutive graphs - r

I have the same graph represented at two different times, g.t0 and g.t1. g.t1 differs from g.t0 in having one additional edge, but it keeps the same vertices.
I want to compare the communities in g.t0 and g.t1, that is, to test whether any vertices moved to a different community from t0 to t1. I tried the following:
library(igraph)
m <- matrix(c(0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0),nrow=4,ncol=4)
g.t0 <- graph.adjacency(m)
memb.t0 <- membership(edge.betweenness.community(g.t0))
V(g.t0)
# Vertex sequence:
# [1] 1 2 3 4
memb.t0
# [1] 1 2 2 3
g.t1 <- add.edges(g.t0,c(1,2))
memb.t1 <- membership(edge.betweenness.community(g.t1))
V(g.t1)
# Vertex sequence:
# [1] 1 2 3 4
memb.t1
# [1] 1 1 1 2
But of course the problem is that the indexing of the communities always starts from 1. So in the example it looks as if all the vertices have moved to a different community, but the most intuitive reading is that only vertex 1 changed community, joining vertices 2 and 3.
How could I approach the problem of counting the number of vertices that changed communities from t0 to t1?

Actually, this is not an easy question. In general you need to match the communities in the two graphs, using some rule or criterion that the matching optimizes. Since the two graphs can have different numbers of communities, the matching is not necessarily bijective.
Several methods and quantities have been proposed for this problem, and a number of them are implemented in igraph, see
http://igraph.org/r/doc/compare.html
compare.communities(memb.t1, memb.t0, method="vi")
# [1] 0.4773856
compare.communities(memb.t1, memb.t0, method="nmi")
# [1] 0.7020169
compare.communities(memb.t1, memb.t0, method="rand")
# [1] 0.6666667
See the references in the igraph manual for the details about the methods.
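To count the vertices that changed community, one possible approach (a minimal sketch, not something built into igraph) is to relabel each t1 community with the t0 community it overlaps most, and then compare the labels vertex by vertex. The membership vectors below are copied from the question; the names overlap, best.t0 and memb.t1.relabelled are made up for this illustration, and ties in the overlap are resolved by taking the first maximum, which is an arbitrary choice.

```r
# Membership vectors as printed in the question
memb.t0 <- c(1, 2, 2, 3)
memb.t1 <- c(1, 1, 1, 2)

# Contingency table: rows are t0 communities, columns are t1 communities
overlap <- table(t0 = memb.t0, t1 = memb.t1)

# For every t1 community, pick the t0 community that shares the most vertices
best.t0 <- as.integer(rownames(overlap))[apply(overlap, 2, which.max)]
names(best.t0) <- colnames(overlap)

# Express the t1 membership in t0 labels and compare vertex by vertex
memb.t1.relabelled <- unname(best.t0[as.character(memb.t1)])
changed <- which(memb.t1.relabelled != memb.t0)

changed          # vertices whose community changed
length(changed)  # how many of them there are
```

With these vectors only vertex 1 is reported as having moved. Because several t1 communities can map to the same t0 community, the matching is not bijective, so the count is best treated as a heuristic.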

Related

How to create all non-isomorphic trees with n=6 nodes?

I need to create all non-isomorphic trees with n=6 nodes. I have found the degree sequences and tried to generate the trees with the degree.sequence.game() function:
library(igraph)
set.seed(46)
par(mfrow=c(2, 3))
degs <- matrix(c(1,1,1,2,2,3,
1,1,1,3,2,2,
1,1,2,2,2,2,
1,1,1,1,2,4,
1,1,1,1,1,5,
1,1,1,1,3,3), nrow=6, byrow=T)
for(i in 1:6){
g6 <- degree.sequence.game(degs[i,], method="vl")
plot(g6, vertex.label=NA)
}
The output is:
One can see that graphs A and B in the left figure are isomorphic.
The expected result is shown in the right figure.
Question. What is an alternative method to create non-isomorphic trees?
Update
It seems I misunderstood your objective. One possible solution is to use the simple.no.multiple.uniform method of degree.sequence.game, i.e.,
g6 <- degree.sequence.game(degs[i, ], method = "simple.no.multiple.uniform")
and we can obtain
BTW, the version of igraph I am using is igraph_1.3.5 (you can see it when typing sessionInfo() in the console) and you can try with this version, which hopefully helps to address your problem as well.
Previous Answer
I think the pain point in your problem is "How to find all distinct degree sequences with given number of vertices in a tree graph?".
We can break this primary problem into two sub-problems:
What is the sum of degrees given n vertices (if we want to generate a tree)? The answer is: 2*(n-1)
How to partition 2*(n-1) into n distinct groups of positive integers? The answer is: using partitions::restrictedparts
library(partitions)
n <- 6
degs <- t(restrictedparts(2*(n-1), n, include.zero = FALSE))
and you will see
> degs
[1,] 1 1 1 1 1 5
[2,] 1 1 1 1 2 4
[3,] 1 1 1 1 3 3
[4,] 1 1 1 2 2 3
[5,] 1 1 2 2 2 2
then you can use degree.sequence.game(degs[i,], method="vl") by iterating i through 1 to nrow(degs).
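Note that a degree sequence does not pin down a tree: for n = 6 there are five distinct degree sequences but six non-isomorphic trees, because the sequence 1,1,1,2,2,3 is realized by two different trees. A different route, sketched below under the assumption that your igraph version provides make_from_prufer() and isomorphic(), is to enumerate every labelled tree through its Prüfer sequence and keep one representative per isomorphism class. The variable names (prufer, trees, is.new) are illustrative only.

```r
library(igraph)

n <- 6

# Every labelled tree on n vertices corresponds to exactly one
# Pruefer sequence of length n - 2 with entries in 1..n.
prufer <- as.matrix(expand.grid(rep(list(1:n), n - 2)))

trees <- list()
for (i in seq_len(nrow(prufer))) {
  g <- make_from_prufer(as.integer(prufer[i, ]))
  # Keep g only if it is not isomorphic to a tree we already have
  is.new <- TRUE
  for (h in trees) {
    if (isomorphic(g, h)) {
      is.new <- FALSE
      break
    }
  }
  if (is.new) trees[[length(trees) + 1]] <- g
}

length(trees)  # number of non-isomorphic trees found

par(mfrow = c(2, 3))
for (g in trees) plot(g, vertex.label = NA)
```

This brute-force enumeration visits n^(n-2) sequences (1296 for n = 6), so it is only practical for small n.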

Getting the biggest connected component in R igraph

How do I get a subgraph of the biggest component of a graph?
Say for example I have a graph g.
size_components_g <-clusters(g, mode="weak")$csize
size_components_g
#1 2 3 10 25 2 2 1
max_size <- max(size_components_g)
max_size
#25
So 25 is the biggest size.
I want to extract the component that has these 25 vertices. How do I do that?
A detailed explanation of the return value of any function in an R package can be found in its documentation. In this case igraph::clusters returns a named list in which csize stores the sizes of the clusters, while membership contains the id of the cluster to which each vertex belongs.
g <- igraph::sample_gnp(20, 1/20)
components <- igraph::clusters(g, mode="weak")
biggest_cluster_id <- which.max(components$csize)
# ids
vert_ids <- igraph::V(g)[components$membership == biggest_cluster_id]
# subgraph
igraph::induced_subgraph(g, vert_ids)
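Since sample_gnp produces a different graph on every run, a small deterministic check can make the steps easier to follow. The sketch below assumes the newer function names components() and graph_from_literal() are available in your igraph version; the toy graph and the names comp and giant are invented for the illustration.

```r
library(igraph)

# Toy graph with three components: A-B-C-D (4 vertices), E-F (2), G (1)
g <- graph_from_literal(A-B-C-D, E-F, G)

comp <- components(g, mode = "weak")
comp$csize  # sizes of the components

# which.max returns the first maximum, so ties go to the lowest component id
giant <- induced_subgraph(g, V(g)[comp$membership == which.max(comp$csize)])
V(giant)$name
```

If several components share the maximum size, only the first one is extracted; comparing comp$csize against max(comp$csize) would reveal all of them.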

Find all closed loops in a network

This question is basically a duplicate of this, however I'm interested in solutions in R.
Does anyone know an approach with igraph or other CRAN-based packages which would allow you to identify closed loops (for example, DGHD, BCDB, or BCEFDB, if the letters are nodes)?
Note that I have a relatively large network with ~ 700 edges and ~ 100 nodes, so it would be good if the solution is not computationally too expensive.
One more important piece of information is that my network is directed.
I am assuming that you are only interested in paths that do not go through any node twice except that the beginning equals the end. With a little work, you can do this in igraph using all_simple_paths. The key point to notice is that any closed loop without repeated nodes is a simple path from a vertex, v, to one of v's neighbors, followed by the single link from the neighbor back to v. I will show how to get all simple closed loops like this starting and ending with a single node. You can simply loop through all of the nodes if you want all examples in the graph.
First, we need some example data.
library(igraph)
set.seed(1234)
g = erdos.renyi.game(8,0.35)
plot(g)
I will get the closed loops starting and ending at node 8, because that node shows the interesting issues.
V = 8
SP = all_simple_paths(g, from=V, to=neighbors(g, v=V))
We do not want to include paths that just go to a neighbor and directly back (like 8-2-8) so we eliminate the paths with just one link.
SP2 = SP[sapply(SP, function(p) length(p)> 2)]
Depending on what you want, we might be done here, but I suspect that you do not want both a path and the same path in reverse, e.g. I think that you do not want both 8-2-5-8 and 8-5-2-8. We can get rid of these duplicates by insisting that the first neighbor (the second node in the path) has a smaller index than the last one.
SP3 = SP2[sapply(SP2, function(p) p[2] < p[length(p)])]
But we have also left off the return to the first node, so we add the first node on to the end of each path.
SP4 = lapply(SP3, function(p) c(unclass(p), V))
SP4
[[1]]
[1] 8 2 5 8
[[2]]
[1] 8 2 5 4 8
[[3]]
[1] 8 2 5 7 3 4 8
[[4]]
[1] 8 4 3 7 5 8
[[5]]
[1] 8 4 5 8
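The loop over all nodes mentioned above could be sketched as follows. This is one possible arrangement rather than a tested recipe for large graphs: simple_cycles_from is a hypothetical helper, and to avoid reporting the same cycle once per starting node it only keeps cycles in which the starting node has the smallest index. Like the steps above it treats the graph as undirected; for a directed network the direction filter (p[2] < p[length(p)]) would have to be dropped and the neighbors restricted to those with an edge back to v.

```r
library(igraph)

# Hypothetical helper: simple cycles in which v is the lowest-numbered vertex
simple_cycles_from <- function(g, v) {
  nb <- neighbors(g, v)
  if (length(nb) == 0) return(list())
  paths <- lapply(all_simple_paths(g, from = v, to = nb), as.integer)
  keep <- vapply(paths, function(p) {
    length(p) > 2 &&        # drop v-neighbor-v round trips
      all(p[-1] > v) &&     # report each cycle only from its smallest vertex
      p[2] < p[length(p)]   # keep one of the two directions
  }, logical(1))
  lapply(paths[keep], function(p) c(p, v))  # close the loop
}

g <- make_ring(5)  # toy undirected graph with exactly one cycle
cycles <- unlist(lapply(seq_len(vcount(g)),
                        function(v) simple_cycles_from(g, v)),
                 recursive = FALSE)
cycles
```

The number of simple paths can grow exponentially with the size of the graph, so on a network with ~100 nodes and ~700 edges this may be slow or exhaust memory.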

R, igraph walktrap.community splits all nodes in the case of a fully connected graph

Using the walktrap.community approach for defining communities within my graph works great - of all the algorithms I tested it performs the best. The caveat is that in the case of a fully connected graph with no self linkages (every node connects to each other node, but not itself) each node is assigned its own community.
I am not experienced in network analysis, but this seems like an interesting case and it's certainly not desired behavior. How can I avoid this splitting in my actual data?
library(igraph)
match.mat = matrix(T, nrow=8, ncol=8)
diag(match.mat)[1:8] = T
topology = which(match.mat, arr.ind=T)
g = graph.data.frame(topology, directed=F)
cm = walktrap.community(g)
membership(cm)
# 2 3 4 5 6 7 8 1
# 1 1 1 1 1 1 1 1
plot(cm, g)
diag(match.mat)[1:8] = F
topology = which(match.mat, arr.ind=T)
g = graph.data.frame(topology, directed=F)
cm = walktrap.community(g)
membership(cm)
#2 3 4 5 6 7 8 1
#1 2 3 4 5 6 7 8
plot(cm, g)
Conceptually I'm not sure how the lack of self linkages would lead to every node being split - maybe possible communities are all tied and therefore split? But the case of all self linkages would seem equivalent in that regard.
Thanks!
http://www-rp.lip6.fr/~latapy/Publis/communities.pdf
If you have read the paper carefully, you will note that the Walktrap builds a node distance measure based on the random walk transition matrix. However, this transition matrix needs to be ergodic, therefore its underlying adjacency matrix needs to be connected and non-bipartite. Non-bipartiteness is achieved by adding self loops to the nodes. Therefore, you need to add self loops to each node in your graph. Maybe it will be a good idea for the future to include this correction in the igraph package, but as far as I know they are using the C implementation of Latapy and Pons and for this one the graph needs to have self loops. Hope this answers your question!
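If the graph already exists as an igraph object, the self loops can be added without rebuilding it from a matrix. The following is a minimal sketch, assuming the newer function names make_full_graph(), add_edges() and cluster_walktrap(); g.loops is an illustrative name. The question's first example (diagonal set to TRUE) suggests that the nodes then end up in a single community, but that is worth checking on your own data and igraph version.

```r
library(igraph)

# Fully connected graph on 8 nodes, no self loops
g <- make_full_graph(8)

# add_edges takes a flat vector of endpoint pairs: 1,1, 2,2, ..., 8,8
g.loops <- add_edges(g, rep(seq_len(vcount(g)), each = 2))

cm <- cluster_walktrap(g.loops)
membership(cm)
```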

count cycles in network

What is the best way to count both 3-cycles and 4-cycles in networks, or are there any ways implemented in R?
3-cycles are connected groups of three nodes (triangles), to be calculated from one-mode networks.
4-cycles are connected groups of four nodes (squares), to be calculated from two-mode networks.
If I have networks like this:
onemode <- read.table(text= "start end
1 2
1 3
4 5
4 6
5 6",header=TRUE)
twomode <- read.table(text= "typa typev
aa a
bb b
bb a
aa b",header=TRUE)
I thought
library(igraph)
g <- graph.data.frame(twomode)
E(g)
graph.motifs(g, size = 4)
would count the number of squares in my two-mode network, but I don't understand the output. I thought the result would be 1.
?graph.motifs
graph.motifs searches a graph for motifs of a given size and returns a
numeric vector containing the number of different motifs. The order of
the motifs is defined by their isomorphism class, see graph.isoclass.
So the output is a numeric vector in which each value is the count of one particular motif (of size 3 or 4) in your graph.
graph.motifs(g,size=4)
To get the total number of the motifs, you can use graph.motifs.no
graph.motifs.no(g,size=4)
[1] 1
That single motif is the one at position 20 of the vector returned by graph.motifs:
which(graph.motifs(g,size=4) >0)
[1] 20
Another function that might be easier to use for this task is kcycle.census {sna}. Details: http://svitsrv25.epfl.ch/R-doc/library/sna/html/path.census.html
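For small undirected networks the two counts can also be obtained directly from the adjacency matrix in base R. The sketch below is an alternative to the motif functions, not what graph.motifs does internally; adjacency_from_edges, count_triangles and count_squares are hypothetical helper names. Triangles are counted as trace(A^3)/6, and squares by counting, for every pair of nodes, the pairs of common neighbors they share (each square is found from both of its diagonals, hence the division by 2).

```r
onemode <- read.table(text = "start end
1 2
1 3
4 5
4 6
5 6", header = TRUE)
twomode <- read.table(text = "typa typev
aa a
bb b
bb a
aa b", header = TRUE)

# Build a symmetric 0/1 adjacency matrix from a two-column edge list
adjacency_from_edges <- function(edges) {
  idx <- cbind(as.character(edges[[1]]), as.character(edges[[2]]))
  nodes <- unique(as.vector(idx))
  A <- matrix(0, length(nodes), length(nodes), dimnames = list(nodes, nodes))
  A[idx] <- 1
  A[idx[, 2:1, drop = FALSE]] <- 1
  A
}

# Each triangle contributes 6 closed walks of length 3
count_triangles <- function(A) sum(diag(A %*% A %*% A)) / 6

# (A %*% A)[i, j] is the number of common neighbors of i and j
count_squares <- function(A) {
  A2 <- A %*% A
  sum(choose(A2[upper.tri(A2)], 2)) / 2
}

count_triangles(adjacency_from_edges(onemode))
count_squares(adjacency_from_edges(twomode))
```

On the example data this gives one triangle (nodes 4, 5, 6) and one square, in line with the result the question expected. The dense matrix products make it suitable only for networks of modest size.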
