generating a community graph in igraph - r

I have been searching for an answer to this question but could not find any mention, so I decided to post here. I am trying to see if igraph or any packages provide a simple way to create a "community graph" where each node represents a community in the network and the ties represent ties between the communities. I can get the community detection algorithm to work fine in igraph, but I could not find a way to collapse the results to just show connections between each community. Any assistance would be appreciated.

You can simply use the contract.vertices() function. It contracts each group of vertices into a single vertex, which is essentially what you want. E.g.
library(igraph)
## create example graph
g1 <- graph.full(5)
V(g1)$name <- 1:5
g2 <- graph.full(5)
V(g2)$name <- 6:10
g3 <- graph.ring(5)
V(g3)$name <- 11:15
g <- g1 %du% g2 %du% g3 + edge('1', '6') + edge('1', '11')
## Community structure
fc <- fastgreedy.community(g)
## Create community graph, edge weights are the number of edges
cg <- contract.vertices(g, membership(fc))
E(cg)$weight <- 1
cg2 <- simplify(cg, remove.loops=FALSE)
## Plot the community graph
plot(cg2, edge.label=E(cg2)$weight, margin=.5, layout=layout.circle)
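As a cross-check on those edge weights, the same counts can be computed directly from the membership vector. This is only a sketch, assuming igraph 1.0 or later: it uses the newer function names (make_full_graph, make_ring, add_edges, cluster_fast_greedy) in place of graph.full, graph.ring and fastgreedy.community, and the variable names are illustrative.

```r
library(igraph)

# Two 5-cliques and a 5-ring, bridged by two extra edges (unnamed vertices)
g <- make_full_graph(5) %du% make_full_graph(5) %du% make_ring(5)
g <- add_edges(g, c(1, 6, 1, 11))

memb <- as.integer(membership(cluster_fast_greedy(g)))

# Community of both endpoints of every edge
el <- ends(g, E(g), names = FALSE)
a <- memb[el[, 1]]
b <- memb[el[, 2]]

# Count edges per (unordered) pair of communities;
# entries with from == to are edges inside a community
counts <- table(from = pmin(a, b), to = pmax(a, b))
print(counts)
```

The diagonal of the table corresponds to the loop weights in cg2, and the off-diagonal entries to the weights of the edges between communities.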

Related

Igraph R: Constraint calculation NA for some vertices for two combined graphs

Using R and igraph, I have two graphs I have combined, then run constraint() on. Constraint works on g1 but when I add g2, constraint() returns NA for the new vertices added from g2 and vertices next to these new vertices.
Here is example code that replicates my problem. I obtain g1 from a projection of a bipartite graph, as this reflects my data process.
My problem may be an issue with the union of the two graphs, but whichever way I combine them (g3 <- g1 + g2 or g3 <- graph.union(g1, g2)), the constraint calculation returns NAs.
set.seed(42)
g <- sample_bipartite(100, 10, type = c("gnp"), p=.03, directed = FALSE)
gproj <- bipartite_projection(g, types=NULL, multiplicity = TRUE)
g1 <- gproj[[1]]
V(g1)$name <- 1:vcount(g1) #this gets their actual vertex id to show as label
V(g1)$name
components <- decompose.graph(g1)
largest <- which.max(sapply(components, vcount))
largest #this tells me which component is largest
lc <- components[[largest]]
lc
plot(lc)
cg1 <- constraint(lc)
cg1 #constraint for all connected vertices is calculated
#Create g2, some vertices in g1, some are new
rel <- data.frame(rel1 = c(95), rel2 = c(2000), stringsAsFactors = FALSE) #create edgelist of g2, one vertex in lc, one vertex new
g2 <- graph.data.frame(rel, directed=FALSE)
#combine graphs and calculate constraint on combined graph
#g1 is used instead of lc because relationships in g2 may connect previously isolated vertices/components
g3 <- g1 + g2
components1 <- decompose.graph(g3)
largest1 <- which.max(sapply(components1, vcount))
largest1 #this tells me the first component is largest
lc1 <- components1[[largest1]]
plot(lc1)
cg3 <- constraint(lc1)
cg3 #now constraint for vertices close to 2000 is NA
For further information, other igraph measures such as eigenvector centrality, degree, and bonpower do not experience this issue.
Through some additional investigation, I have figured out a solution. The issue was that g1 had edge weights, while g2 did not. After removing the weights with the code below (and using g1a in place of g1 from then on), constraint() is calculated for all vertices.
g1a <- remove.edge.attribute(g1, "weight")
Better still, this solution worked not only for this toy example but also for my full, much larger, dataset.
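For anyone hitting the same thing, here is a small self-contained sketch of that fix. It assumes igraph 1.0 or later (delete_edge_attr is the newer name for remove.edge.attribute), and the two tiny graphs are made-up stand-ins for g1 and g2, not the data from the question.

```r
library(igraph)

# Weighted, named graph standing in for the projection g1
g1 <- make_ring(4)
V(g1)$name <- as.character(1:4)
E(g1)$weight <- c(1, 2, 1, 3)

# Unweighted graph sharing vertex "4" and adding a new vertex "2000"
g2 <- graph_from_data_frame(data.frame(rel1 = "4", rel2 = "2000"),
                            directed = FALSE)

# Drop the weights before combining, so no edge ends up with a missing weight
g1a <- delete_edge_attr(g1, "weight")
g3 <- g1a + g2

cg3 <- constraint(g3)
print(cg3)
```

If the weights matter for the analysis, an alternative would be to give the edges of g2 an explicit weight before combining, rather than dropping the attribute.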

Retrieve all edges of a community in igraph

How can I get all edges of a community in igraph?
g <- graph.edgelist(edges, directed=FALSE)
c <- edge.betweenness.community(g)
listOfCommunities <- communities(c)
listOfCommunities stores a list of vertices for each community. I would like to get all the edges for each community.
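One possible approach, offered as a sketch rather than a definitive answer: look up the community of both endpoints of every edge and keep the edges whose endpoints agree. It assumes igraph 1.0 or later (cluster_edge_betweenness is the newer name for edge.betweenness.community); the example graph and the name edges_by_comm are invented for illustration.

```r
library(igraph)

# Example graph: a 4-clique and a triangle joined by one bridge edge
g <- make_full_graph(4) %du% make_full_graph(3)
g <- add_edges(g, c(1, 5))

comm <- cluster_edge_betweenness(g)
memb <- as.integer(membership(comm))

# Endpoints of each edge as vertex ids, one row per edge
el <- ends(g, E(g), names = FALSE)

# An edge belongs to a community when both endpoints are in it
internal <- memb[el[, 1]] == memb[el[, 2]]

# List of edge ids, one element per community
edges_by_comm <- split(which(internal), memb[el[internal, 1]])
print(edges_by_comm)
```

E(g)[edges_by_comm[[1]]] then gives the edge sequence of the first community. Edges that run between communities are left out; crossing(comm, g) flags those.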

Inconsistency when plotting communities grouping

I have a problem while plotting the communities. Please consider the following MWE
library(igraph)
m <- matrix(c(0,0,0,0,0,0,
1,0,0,0,0,0,
0,0,0,0,1,0,
4,1,0,0,0,0,
0,0,0,0,0,1,
0,0,0,0,0,0),nrow=6,ncol=6)
g <- graph.adjacency(m)
memb <- membership(edge.betweenness.community(g))
memb
# [1] 1 1 2 1 2 2
I then expect to see two communities in the plot when doing
plot(g, mark.groups=list(memb), edge.width=0.5, edge.arrow.width=0.2)
But actually I get only one community
Am I doing something wrong?
You can plot the result of the community structure detection, the communities object, instead of plotting the graph. See the example in ?plot.communities.
ebc <- edge.betweenness.community(g)
plot(ebc, g)
If I understand your question correctly, then you are using the mark.groups argument incorrectly: it expects a list of vertex id vectors, one per group, not a membership vector. Try
plot(g,
mark.groups=lapply(unique(memb), function(n) which(memb==n)),
edge.width=0.5,
edge.arrow.width=0.2)
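The lapply() call above turns the membership vector into a list with one vector of vertex ids per community. A base R sketch of the same conversion using split(), with the membership vector from the question typed in by hand:

```r
# Membership vector as printed in the question
memb <- c(1, 1, 2, 1, 2, 2)

# One element per community, each holding the ids of its vertices
groups <- split(seq_along(memb), memb)
print(groups)
```

groups can then be passed as plot(g, mark.groups=groups, ...). By contrast, list(memb) is a single group whose "vertex ids" are the values 1 and 2, which is why only one community was drawn.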

finding modularity and community membership

I want to find 1) the modularity of a network and 2) the identities of the nodes in each community.
I think this is the way to get modularity:
g <- graph.full(5) %du% graph.full(5) %du% graph.full(5)
g <- add.edges(g, c(1,6, 1,11, 6, 11))
ebc <- edge.betweenness.community(g)
sizes(ebc)
#Community sizes
#1 2 3
#5 5 5
modularity(g,membership(ebc))
#[1] 0.5757576
but on this link by Gabor I get this code:
memberships <- list()
G <- graph.full(5) %du% graph.full(5) %du% graph.full(5)
G <- add.edges(G, c(1,6, 1,11, 6, 11))
### edge.betweenness.community
ebc <- edge.betweenness.community(G)
mods <- sapply(0:ecount(G), function(i) {
g2 <- delete.edges(G, ebc$removed.edges[seq(length=i)])
cl <- clusters(g2)$membership
modularity(G, cl)
})
g2 <- delete.edges(G, ebc$removed.edges[1:(which.max(mods)-1)])
memberships$`Edge betweenness` <- clusters(g2)$membership
This seems to be doing the same thing that I am. I think the delete.edges and clusters calls split the modules into separate components by deleting the connecting edges, and then get the IDs of the nodes in each component.
Although there are a few questions I have:
In @Gabor Csardi's code why does the modularity call use clusters(G)$membership and not ebc like I did in my example? What is the difference? (I would have thought that would overestimate actual modularity?) There also seems to be an alternative using community.to.membership
The mods code returns this error which I do not understand which is partly why I cannot explore g2 and cl to see more of what is happening:
Error in delete.edges(G, ebc$removed.edges[seq(length = i)]) :
At iterators.c:1809 : Cannot create iterator, invalid edge id, Invalid vertex id
The general answer is that the code in the wiki is outdated. It works with igraph version 0.5.x, but not with 0.6.x. More specifically:
In igraph version 0.5.x there was no direct way to get the membership vector, in version 0.6.x you can just say membership(ebc).
In version 0.5.x vertex ids start with zero, in version 0.6.x they start with one, so you don't need to subtract one in the delete.edges() line.
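With a recent igraph, both parts of the question can be read off the communities object directly. A minimal sketch, assuming igraph 1.0 or later, where make_full_graph, add_edges and cluster_edge_betweenness are the newer names for graph.full, add.edges and edge.betweenness.community:

```r
library(igraph)

# Three 5-cliques joined in a triangle, as in the question
g <- make_full_graph(5) %du% make_full_graph(5) %du% make_full_graph(5)
g <- add_edges(g, c(1, 6, 1, 11, 6, 11))

ebc <- cluster_edge_betweenness(g)

# 1) modularity of the partition that was found
memb <- membership(ebc)
q <- modularity(g, memb)
print(q)

# 2) vertex ids in each community, as a list with one element per community
groups <- communities(ebc)
print(groups)
```

No manual delete.edges() loop is needed here, because membership() already returns the partition chosen by the algorithm.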

2nd Degree Connections in igraph

I think I have this working correctly, but I am looking to mimic something similar to Facebook's friend suggestions. Simply put, I am looking to find 2nd degree connections (friends of your friends that you do not have a connection with). I do want to keep this as a directed graph and identify the 2nd degree outward connections (the people your friends connect to).
I believe my dummy code is achieving this, but since the reference is on indices and not vertex labels, I was hoping you could help me modify the code to return useable names.
### create some fake data
library(igraph)
from <- sample(LETTERS, 50, replace=T)
to <- sample(LETTERS, 50, replace=T)
rel <- data.frame(from, to)
head(rel)
### lets plot the data
g <- graph.data.frame(rel)
summary(g)
plot(g, vertex.label=LETTERS, edge.arrow.size=.1)
## find the 2nd degree connections
d1 <- unlist(neighborhood(g, 1, nodes="F", mode="out"))
d2 <- unlist(neighborhood(g, 2, nodes="F", mode="out"))
d1;d2;
setdiff(d2,d1)
Returns
> setdiff(d2,d1)
[1] 13
Any help you can provide will be great. Obviously I am looking to stay within R.
You can index back into the graph vertices like:
> V(g)[setdiff(d2,d1)]
Vertex sequence:
[1] "B" "W" "G"
Also check out ?V for ways to get at this type of info through direct indexing.
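Here is a small reproducible sketch of the same idea with a fixed edge list instead of random data, so the result is predictable. It assumes igraph 1.0 or later, where ego() is the newer name for neighborhood() and as_ids() converts a vertex sequence to names; the edge list itself is made up.

```r
library(igraph)

# Fixed directed edge list: F -> A, F -> B, A -> C, B -> F, A -> B
rel <- data.frame(from = c("F", "F", "A", "B", "A"),
                  to   = c("A", "B", "C", "F", "B"))
g <- graph_from_data_frame(rel)

# Vertices reachable from F in at most 1 and at most 2 outgoing steps
d1 <- ego(g, order = 1, nodes = "F", mode = "out")[[1]]
d2 <- ego(g, order = 2, nodes = "F", mode = "out")[[1]]

# Work with names rather than indices
second <- setdiff(as_ids(d2), as_ids(d1))
print(second)
```

In this graph the only vertex that F reaches in two steps but not in one is C.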
You can use the adjacency matrix G of the graph g. One of the properties of the adjacency matrix is that its nth power gives you the number of walks of length n between each pair of vertices.
G <- get.adjacency(g)
G2 <- G %*% G # G2 contains 2-walks
diag(G2) <- 0 # take out loops
G2[G2!=0] <- 1 # normalize G2, not interested in multiplicity of walks
g2 <- graph.adjacency(G2)
An edge in graph g2 represents a "friend-of-a-friend" bond.
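The steps above keep every pair joined by a 2-walk, including pairs that are already directly connected. A sketch of the same matrix approach that also drops existing direct connections, which is my own addition to match the "no connection yet" requirement in the question. It assumes igraph 1.0 or later (as_adjacency_matrix is the newer name for get.adjacency), uses a dense matrix for simplicity, and the edge list is made up.

```r
library(igraph)

# Fixed directed edge list: F -> A, F -> B, A -> C, B -> F, A -> B
rel <- data.frame(from = c("F", "F", "A", "B", "A"),
                  to   = c("A", "B", "C", "F", "B"))
g <- graph_from_data_frame(rel)

A <- as_adjacency_matrix(g, sparse = FALSE)
A[A != 0] <- 1        # ignore edge multiplicity

A2 <- A %*% A         # A2[i, j] = number of walks of length 2 from i to j
diag(A2) <- 0         # a vertex is not its own suggestion

# Reachable in two steps, but not already a direct connection
fof <- (A2 > 0) & (A == 0)

suggestions_F <- names(which(fof["F", ]))
print(suggestions_F)
```

For large graphs a sparse matrix would be the better choice; the dense form is used here only to keep the indexing obvious.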
