Finding modularity and community membership in R (igraph)

I want to find (1) the modularity of a network and (2) the identities of the nodes in each community.
I think this is the way to get the modularity:
g <- graph.full(5) %du% graph.full(5) %du% graph.full(5)
g <- add.edges(g, c(1,6, 1,11, 6, 11))
ebc <- edge.betweenness.community(g)
sizes(ebc)
#Community sizes
#1 2 3
#5 5 5
modularity(g,membership(ebc))
#[1] 0.5757576
But the igraph wiki code by Gabor does it this way:
memberships <- list()
G <- graph.full(5) %du% graph.full(5) %du% graph.full(5)
G <- add.edges(G, c(1,6, 1,11, 6, 11))
### edge.betweenness.community
ebc <- edge.betweenness.community(G)
mods <- sapply(0:ecount(G), function(i) {
  g2 <- delete.edges(G, ebc$removed.edges[seq(length=i)])
  cl <- clusters(g2)$membership
  modularity(G, cl)
})
g2 <- delete.edges(G, ebc$removed.edges[1:(which.max(mods)-1)])
memberships$`Edge betweenness` <- clusters(g2)$membership
This seems to be doing the same thing that I am. I think the delete.edges and clusters calls split the modules into separate components by deleting the connecting edges, and then get the IDs of the nodes in each component.
I have a few questions, though:
In Gabor Csardi's code, why does the modularity call use clusters(g2)$membership rather than membership(ebc), as in my example? What is the difference? (I would have thought that would overestimate the actual modularity.) There also seems to be an alternative using community.to.membership.
The mods code fails with the error below, which I do not understand; that is partly why I cannot inspect g2 and cl to see more of what is happening:
Error in delete.edges(G, ebc$removed.edges[seq(length = i)]) :
At iterators.c:1809 : Cannot create iterator, invalid edge id, Invalid vertex id

The general answer is that the code in the wiki is outdated. It works with igraph version 0.5.x, but not with 0.6.x. More specifically:
In igraph version 0.5.x there was no direct way to get the membership vector; in version 0.6.x you can just call membership(ebc).
In version 0.5.x vertex ids start at zero; in version 0.6.x they start at one, so you don't need to subtract one in the delete.edges() line.
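To see what the modularity number in the question measures, here is a minimal base-R sketch that rebuilds the example graph as an adjacency matrix and evaluates the usual modularity formula, Q = (1/2m) * sum over i, j of (A_ij - k_i k_j / 2m) for pairs in the same community. The function name modularity_by_hand and the membership vector memb are my own illustrative choices, not part of igraph.

```r
# Base-R sketch: modularity of a partition computed directly from the
# adjacency matrix (no igraph needed).

# Adjacency matrix of one complete graph on 5 vertices
k5 <- matrix(1, 5, 5) - diag(5)

# Three disjoint copies of K5 on vertices 1-5, 6-10 and 11-15
A <- matrix(0, 15, 15)
for (b in 0:2) {
  idx <- b * 5 + 1:5
  A[idx, idx] <- k5
}

# The three connecting edges from the question: 1-6, 1-11, 6-11
extra <- rbind(c(1, 6), c(1, 11), c(6, 11))
A[extra] <- 1
A[extra[, 2:1]] <- 1              # mirror them to keep A symmetric

memb <- rep(1:3, each = 5)        # assumed membership: one community per K5

modularity_by_hand <- function(A, memb) {
  m <- sum(A) / 2                 # number of edges
  k <- rowSums(A)                 # vertex degrees
  same <- outer(memb, memb, "==") # TRUE where i and j share a community
  sum((A - outer(k, k) / (2 * m)) * same) / (2 * m)
}

Q <- modularity_by_hand(A, memb)
print(Q)
```

With 33 edges and one community per clique this works out to 19/33, about 0.5757576, which agrees with the value printed by modularity(g, membership(ebc)) in the question.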

Related

R igraph is_matching always False

I am trying to apply the is_matching function in the igraph R package. I don't know why my answer is always FALSE, even when it is clearly a matching. Here is my code:
library(igraph)
relations=data.frame(from=c(1,2),to=c(3,4))
g <- graph_from_data_frame(relations, directed=FALSE, vertices=1:4)
mm=c(1,3)
is_matching(g,mm)
[1] FALSE
I really appreciate any help!
I have no idea why this works and your code doesn't because they are nearly identical, but:
relations <- data.frame(from=c(1, 3),to=c(2,4))
g1 <- graph_from_data_frame(relations, directed=FALSE, vertices=c(1, 2, 3, 4))
mm <- c(2,1,4,3)
is_matching(g1, mm)
[1] TRUE
The difference here is that the vertices incident to each matching edge are listed in reverse order in mm, e.g. the edges (1-2, 3-4) become (2,1,4,3). This is strange because, if I construct the edges as you have (1-3, 2-4) and reverse them the same way:
relations <- data.frame(from=c(1, 2),to=c(3,4))
g1 <- graph_from_data_frame(relations, directed=FALSE, vertices=c(1, 2, 3, 4))
mm <- c(3,1,4,2)
is_matching(g1, mm)
[1] FALSE
It comes out as FALSE. I tried to read the function's source and couldn't make sense of it, mostly because it calls functions that igraph does not seem to export, such as as.igraph.vs. If anybody can shed light on this, that would be great.
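One reading that is consistent with all three results above: as I understand the igraph documentation, the matching argument is an integer vector with one entry per vertex, where entry v is the partner of vertex v (NA if v is unmatched). Under that assumption, c(1,3) is too short for a 4-vertex graph, c(2,1,4,3) pairs 1 with 2 and 3 with 4, and c(3,1,4,2) is inconsistent because vertex 1 names 3 as its partner while vertex 3 names 4. The base-R sketch below checks a partner vector against an edge list; is_partner_vector_matching is a hypothetical helper written for illustration, not igraph's implementation.

```r
# Base-R sketch (hypothetical helper, not igraph's implementation):
# check a partner vector against an undirected edge list.
is_partner_vector_matching <- function(edges, n, mm) {
  if (length(mm) != n) return(FALSE)              # need one entry per vertex
  for (v in seq_len(n)) {
    p <- mm[v]
    if (is.na(p)) next                            # vertex v is unmatched
    if (p < 1 || p > n || p == v) return(FALSE)   # partner must be another vertex
    if (is.na(mm[p]) || mm[p] != v) return(FALSE) # partnership must be mutual
    # the pair (v, p) must be an edge, in either orientation
    is_edge <- any((edges[, 1] == v & edges[, 2] == p) |
                   (edges[, 1] == p & edges[, 2] == v))
    if (!is_edge) return(FALSE)
  }
  TRUE
}

e13_24 <- cbind(from = c(1, 2), to = c(3, 4))   # edges 1-3 and 2-4
e12_34 <- cbind(from = c(1, 3), to = c(2, 4))   # edges 1-2 and 3-4

r1 <- is_partner_vector_matching(e13_24, 4, c(1, 3))        # too short
r2 <- is_partner_vector_matching(e12_34, 4, c(2, 1, 4, 3))  # 1<->2, 3<->4
r3 <- is_partner_vector_matching(e13_24, 4, c(3, 1, 4, 2))  # not mutual
r4 <- is_partner_vector_matching(e13_24, 4, c(3, 4, 1, 2))  # 1<->3, 2<->4
print(c(r1, r2, r3, r4))
```

If this reading is right, the vector to try for the graph in the question (edges 1-3 and 2-4) would be c(3, 4, 1, 2).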

Ordering cluster list by cluster size, R igraph

I have a network (g) with thousands of clusters, however I can't seem to figure out how to order them by size. It looks like the membership attribute sorts clusters somewhat arbitrarily. For example:
c <- clusters(g)
c$membership
gs <- induced.subgraph(g, c$membership==1)
This will indeed give me the largest cluster, but if I try
gs <- induced.subgraph(g, c$membership==2)
It doesn't give me the second largest cluster, but an arbitrary cluster that happens to be second in the list.
Is there a way to order c$membership according to cluster size, i.e., 1 – largest, 2 – second largest, etc.?
You could do it this way:
# largest subgraph
gs <- induced.subgraph(g, c$membership==order(-c$csize)[1])
# second largest subgraph
gs <- induced.subgraph(g, c$membership==order(-c$csize)[2])
# etc...
Here's a working example.
library(igraph)
g <- graph.full(5) %du% graph.full(4) %du% graph.full(3)
set.seed(1) # for reproducible plots
par(mar=c(0,0,0,0),mfrow=c(1,2))
plot(g)
c <- clusters(g)
gs <- induced.subgraph(g, c$membership==order(-c$csize)[1])
plot(gs)
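If you would rather renumber the membership vector itself, so that 1 always means the largest cluster, the same order() idea can be combined with match(). This is a base-R sketch on a made-up membership vector; memb and csize stand in for c$membership and c$csize from the question.

```r
# Base-R sketch: renumber cluster ids so that 1 is the largest cluster.
memb <- c(1, 1, 2, 2, 2, 2, 3, 3, 3)   # hypothetical membership vector
csize <- tabulate(memb)                # cluster sizes: 2, 4, 3

by_size <- order(-csize)               # old ids, largest cluster first
ranked <- match(memb, by_size)         # new id = size rank of the old id

print(tabulate(ranked))                # sizes are now in decreasing order
```

After this, ranked == 2 selects the second largest cluster, and so on. Ties in size are broken by the original cluster id.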

generating a community graph in igraph

I have been searching for an answer to this question but could not find any mention, so I decided to post here. I am trying to see if igraph or any packages provide a simple way to create a "community graph" where each node represents a community in the network and the ties represent ties between the communities. I can get the community detection algorithm to work fine in igraph, but I could not find a way to collapse the results to just show connections between each community. Any assistance would be appreciated.
You can simply use the contract.vertices() function. It contracts each group of vertices into a single vertex, which is essentially what you want. For example:
library(igraph)
## create example graph
g1 <- graph.full(5)
V(g1)$name <- 1:5
g2 <- graph.full(5)
V(g2)$name <- 6:10
g3 <- graph.ring(5)
V(g3)$name <- 11:15
g <- g1 %du% g2 %du% g3 + edge('1', '6') + edge('1', '11')
## Community structure
fc <- fastgreedy.community(g)
## Create community graph, edge weights are the number of edges
cg <- contract.vertices(g, membership(fc))
E(cg)$weight <- 1
cg2 <- simplify(cg, remove.loops=FALSE)
## Plot the community graph
plot(cg2, edge.label=E(cg2)$weight, margin=.5, layout=layout.circle)
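To make the contraction step less of a black box, here is a base-R sketch of the same idea on an adjacency matrix: multiplying by a vertex-to-community indicator matrix sums the edges between every pair of communities. The small two-triangle graph and the names M, CG, between_edges and within_edges are illustrative assumptions, not igraph objects.

```r
# Base-R sketch: collapse an adjacency matrix to a community-level matrix.
# Hypothetical graph: two triangles (1-2-3 and 4-5-6) joined by the edge 3-4.
edges <- rbind(c(1, 2), c(1, 3), c(2, 3),
               c(4, 5), c(4, 6), c(5, 6),
               c(3, 4))
A <- matrix(0, 6, 6)
A[edges] <- 1
A[edges[, 2:1]] <- 1                     # mirror to keep A symmetric

memb <- c(1, 1, 1, 2, 2, 2)              # hypothetical community membership

# Indicator matrix: M[i, c] is 1 when vertex i belongs to community c
M <- outer(memb, sort(unique(memb)), "==") * 1

CG <- t(M) %*% A %*% M                   # entries summed per community pair
between_edges <- CG
diag(between_edges) <- 0                 # number of edges between communities
within_edges <- diag(CG) / 2             # each internal edge was counted twice

print(between_edges)
print(within_edges)
```

The off-diagonal entries play the role of the summed edge weights in cg2 above, and the diagonal corresponds to the loops that simplify(cg, remove.loops=FALSE) keeps.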

Using igraph: community membership of components built by decompose.graph()

I would appreciate help with using decompose.graph, the community detection functions from igraph, and lapply.
I have an igraph object G with vertex attribute "label" and edge attribute "weight". I want to calculate community memberships using different functions from igraph; for simplicity, let it be walktrap.community.
This graph is not connected, which is why I decided to decompose it into connected components, run walktrap.community on each component, and afterwards add a community membership vertex attribute to the original graph G.
I am currently doing the following:
comps <- decompose.graph(G,min.vertices=2)
communities <- lapply(comps,walktrap.community)
At this point I get stuck, since I get a list object whose structure I cannot figure out. The documentation on decompose.graph says only that it returns a list object, and when I use lapply on the result I get completely confused. Moreover, the communities are numbered from 0 in each component, and I don't know how to pass the weights parameter to the walktrap.community function.
If it were not for the components, I would have done the following:
wt <- walktrap.community(G, modularity=TRUE, weights=E(G)$weight)
wmemb <- community.to.membership(G, wt$merges,steps=which.max(wt$modularity)-1)
V(G)$"walktrap" <- wmemb$membership
Could anyone please help me solve this issue, or provide some information or links that could help?
You could use a loop:
library(igraph)
set.seed(2)
G <- erdos.renyi.game(100, 1/50)
comps <- decompose.graph(G,min.vertices=2)
length(comps) # 2 components, in this example
for(i in seq_along(comps)) { # For each subgraph comps[[i]]
  wt <- walktrap.community(comps[[i]], modularity=TRUE, weights=E(comps[[i]])$weight)
  wmemb <- community.to.membership(comps[[i]], wt$merges,steps=which.max(wt$modularity)-1)
  V(comps[[i]])$"walktrap" <- wmemb$membership
}
It is possible to do it with lapply and mapply, but it is less readable.
comps <- decompose.graph(G,min.vertices=2)
wt <- lapply( comps, function(u)
  walktrap.community(u, modularity=TRUE, weights=E(u)$weight)
)
wmemb <- mapply(
  function(u,v) community.to.membership(u, v$merges,steps=which.max(v$modularity)-1),
  comps, wt,
  SIMPLIFY=FALSE
)
comps <- mapply(
  function(u,v) { V(u)$"walktrap" <- v$membership; u },
  comps, wmemb,
  SIMPLIFY=FALSE
)
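Both versions leave the community ids numbered separately inside each component. If you want a single attribute on the original graph G, the ids need to be made unique across components first. Here is a base-R sketch of that step, assuming you can line up two vectors with the vertices of G (for instance through a vertex name attribute): the component each vertex belongs to and its community id within that component. The vectors comp and local_id are made-up data for illustration.

```r
# Base-R sketch: turn per-component community ids into labels that are
# unique across the whole graph.
comp     <- c(1, 1, 1, 2, 2, 1, 2)   # hypothetical component id per vertex
local_id <- c(0, 0, 1, 0, 1, 1, 0)   # hypothetical community id within its component

key <- paste(comp, local_id, sep = ".")  # e.g. "1.0", "2.1"
global_id <- match(key, unique(key))     # consecutive ids in order of first appearance

print(global_id)
```

Vertices share a global id only when they are in the same component and the same local community, so community 0 of component 1 no longer collides with community 0 of component 2.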

2nd Degree Connections in igraph

I think I have this working correctly, but I am looking to mimic something similar to Facebook's friend suggestions. Simply put, I am looking to find 2nd degree connections (friends of your friends that you do not have a connection with). I do want to keep this as a directed graph and identify the 2nd degree outward connections (the people your friends connect to).
I believe my dummy code is achieving this, but since the reference is on indices and not vertex labels, I was hoping you could help me modify the code to return useable names.
### create some fake data
library(igraph)
from <- sample(LETTERS, 50, replace=TRUE)
to <- sample(LETTERS, 50, replace=TRUE)
rel <- data.frame(from, to)
head(rel)
### let's plot the data
g <- graph.data.frame(rel)
summary(g)
# label vertices by their own names; LETTERS would not follow the vertex order
plot(g, vertex.label=V(g)$name, edge.arrow.size=.1)
## find the 2nd degree connections
d1 <- unlist(neighborhood(g, 1, nodes="F", mode="out"))
d2 <- unlist(neighborhood(g, 2, nodes="F", mode="out"))
d1;d2;
setdiff(d2,d1)
Returns
> setdiff(d2,d1)
[1] 13
Any help you can provide will be great. Obviously I am looking to stay within R.
You can index back into the graph vertices like:
> V(g)[setdiff(d2,d1)]
Vertex sequence:
[1] "B" "W" "G"
Also check out ?V for ways to get at this type of info through direct indexing.
You can use the adjacency matrix G of the graph g. One property of the adjacency matrix is that entry (i, j) of its nth power gives the number of walks of length n from vertex i to vertex j (walks, unlike paths, may revisit vertices).
G <- get.adjacency(g)
G2 <- G %*% G # G2 contains 2-walks
diag(G2) <- 0 # take out loops
G2[G2!=0] <- 1 # normalize G2, not interested in multiplicity of walks
g2 <- graph.adjacency(G2)
An edge in graph g2 represents a "friend-of-a-friend" bond.
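Note that a two-step walk can also end at someone you are already connected to. To get only the suggestions the question asks for (friends of friends with no direct connection), the direct edges can be masked out. Below is a base-R sketch on a tiny hand-made directed graph; the vertex names and the matrix names A, A2 and suggest are illustrative assumptions.

```r
# Base-R sketch: friend-of-a-friend suggestions from a directed adjacency matrix.
v <- c("A", "B", "C", "D")
A <- matrix(0, 4, 4, dimnames = list(v, v))
A["A", "B"] <- 1
A["B", "C"] <- 1
A["A", "C"] <- 1                   # A is already connected to C directly
A["C", "D"] <- 1

A2 <- A %*% A                      # A2[i, j] = number of 2-step walks from i to j
suggest <- (A2 > 0) & (A == 0)     # reachable in two steps, not already connected
diag(suggest) <- FALSE             # never suggest a vertex to itself

print(names(which(suggest["A", ])))
```

Here C is reachable from A in two steps (through B) but is dropped because A already points to it, so the only suggestion for A is D.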
