I have created an igraph graph from my dataset "allgenes" and found community modules using the Louvain method.
gD <- igraph::simplify(igraph::graph.data.frame(allgenes, directed=FALSE))
lou <- cluster_louvain(gD)
Plotting the modules, I note that there are several small communities that I wish to remove. How would I remove communities containing 5 nodes or fewer?
plot(lou, gD, vertex.label = NA, vertex.size=5, edge.arrow.size = .2)
Plot with distinguished modules:
Since you do not provide an example, I will illustrate with randomly generated data.
## First create an example like yours
library(igraph)
set.seed(123)
gD = erdos.renyi.game(50,0.05)
lou <- cluster_louvain(gD)
LO = layout_with_fr(gD)
plot(lou, gD, vertex.label = NA, vertex.size=5,
edge.arrow.size = .2, layout=LO)
## identify which communities have fewer than 5 members
Small = which(table(lou$membership) < 5)
## Which nodes should be kept?
Keep = V(gD)[!(lou$membership %in% Small)]
## Get subgraph & plot
gD2 = induced_subgraph(gD, Keep)
lou2 = cluster_louvain(gD2)
LO2 = LO[Keep,]
plot(lou2, gD2, vertex.label = NA, vertex.size=5,
edge.arrow.size = .2, layout=LO2)
The small communities have been removed
If you want to remove the small communities while keeping the other existing communities intact, you cannot simply create an induced subgraph of the vertices you want to keep and re-cluster it, because the resulting communities are very likely to change.
A workable approach would be to manually subset the communities object.
Also, if you want to plot the original graph and communities alongside the new ones and keep the same colors everywhere, a couple of additional steps are needed.
suppressPackageStartupMessages(library(igraph))
set.seed(123)
g <- erdos.renyi.game(50, 0.05)
c <- cluster_louvain(g)
l <- layout_with_fr(g)
c_keep_ids <- as.numeric(names(sizes(c)[sizes(c) >= 5]))
c_keep_v_idxs <- which(c$membership %in% c_keep_ids)
g_sub <- induced_subgraph(g, V(g)[c_keep_v_idxs])
# igraph has no direct functionality to subset community objects so hack it
c_sub <- c
c_sub$names <- c$names[c_keep_v_idxs]
c_sub$membership <- c$membership[c_keep_v_idxs]
c_sub$vcount <- length(c_sub$membership)  # names are NULL for unnamed graphs, so count memberships instead
c_sub$modularity <- modularity(g_sub, c_sub$membership, E(g_sub)$weight)
par(mfrow = c(1, 2))
plot(c, g,
layout = l,
vertex.label = NA,
vertex.size = 5
)
plot(c_sub, g_sub,
col = membership(c)[c_keep_v_idxs],
layout = l[c_keep_v_idxs, ],
mark.border = rainbow(length(communities(c)), alpha = 1)[c_keep_ids],
mark.col = rainbow(length(communities(c)), alpha = 0.3)[c_keep_ids],
vertex.label = NA,
vertex.size = 5
)
par(mfrow = c(1, 1))
Allow me to add to this. I want to "remove" the color from small communities when visualizing, but keep them in the graph. For example, I have a lot of isolates that create visual clutter, while the core component is the interesting part and gives a good representation on its own.
I am starting with the code above. Not an issue, because I do not want subgraphs:
Small = which(table(g_community$membership) < 2)
g_community$membership[g_community$membership %in% Small] <- 999
This works well enough, but is there a smarter way to do this?
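One possible alternative, as a rough sketch rather than a definitive answer: leave the membership vector untouched and only change what gets drawn. The example below assumes the same random graph as the answers above; the names big_ids, is_big, pal and v_col are made up for illustration. Vertices in small communities get a neutral grey, and only the large communities are passed to mark.groups.

```r
library(igraph)

set.seed(123)
g <- sample_gnp(50, 0.05)
comm <- cluster_louvain(g)

# Community ids with at least 5 members (threshold is an assumption; adjust as needed)
sz <- sizes(comm)
big_ids <- as.integer(names(sz)[as.vector(sz) >= 5])

# Flag vertices that belong to a large community
memb <- as.integer(membership(comm))
is_big <- memb %in% big_ids

# One color per community; small communities fall back to grey
pal <- rainbow(length(comm))
v_col <- ifelse(is_big, pal[memb], "grey80")

# Draw shaded hulls only around the large communities
plot(g,
     vertex.color = v_col,
     mark.groups = communities(comm)[as.character(big_ids)],
     vertex.label = NA,
     vertex.size = 5)
```

This keeps the communities object intact, so the original ids remain available for later analysis.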
I am trying to do something similar to this and this post. I have an igraph object and want to remove vertices (arrows) based on the values in a column of the edges data frame, color the edges (circles) by a group, and change the line/arrow size based on the same column in the edges data frame. Here is some reproducible code that looks exactly like my data:
# Data
edges <- data.frame(
"agency.from" = c(rep("a",4),rep("b",4),rep("c",4),rep("d",4)),
"agency.to" = c(rep(c("a","b","c","d"),4)),
"comm.freq" = sample(0:5,16, replace=TRUE))
nodes <- data.frame(
"agency" = c("a","b","c","d"),
"group" = c("x", "y", "x", "y"),
"state" = c("i", "j", "j", "i"))
# make igraph object
net <- graph_from_data_frame(d=edges, vertices=nodes, directed=T)
plot(net)
# remove loops
net2 <- simplify(net, remove.multiple = T, remove.loops = T)
plot(net2)
Which gives me:
this
# remove vertices where communication frequency is 1 and 0
net3 <- delete.vertices(net2, which(E(net2)$comm.freq == 1))
net4 <- delete.vertices(net3, which(E(net2)$comm.freq == 0))
plot(net4)
Which does not change the plot at all
Then I try to change the colors and sizes:
# color edges by group
colrs <- c("gray50", "tomato")
V(net4)$color <- colrs[V(net4)$group]
plot(net4)
# make size of arrow based on communication frequency
plot(net4, edge.width = E(net4)$comm.freq * 5, edge.arrow.size = E(net4)$comm.freq)
And still nothing changes
I followed the code provided in the other posts and I'm just really confused why nothing will work.
Any help is much appreciated!
The simplify() function removed your edge attributes. You need to specify how you want those values to be preserved when simplifying your graph. If you just want to keep the first possible value, you can do
net2 <- simplify(net, remove.multiple = T, remove.loops = T, edge.attr.comb=list("first"))
And then you use delete.vertices but you are passing indexes for edges, not vertices. If you want to drop both vertices that are adjacent to an edge with that given property, it should look more like
net3 <- delete_vertices(net2, V(net2)[.inc(E(net2)[comm.freq==1])])
net4 <- delete_vertices(net3, V(net3)[.inc(E(net3)[comm.freq==0])])
And then for the colors you have values like "x" and "y" for group, but you are indexing into the colrs vector which has no idea what "x" and "y" correspond to. It would be better to use a named vector. For example
colrs <- c(x="gray50", y="tomato")
V(net4)$color <- colrs[V(net4)$group]
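If the actual goal is to drop the low-frequency connections themselves rather than the agencies attached to them, delete_edges may be closer to what is wanted. The following is only a sketch under that assumption; it rebuilds the example data with a fixed seed (the seed value is arbitrary) and combines the fixes above.

```r
library(igraph)

set.seed(1)  # arbitrary seed so the sampled frequencies are reproducible
edges <- data.frame(
  agency.from = rep(c("a", "b", "c", "d"), each = 4),
  agency.to   = rep(c("a", "b", "c", "d"), 4),
  comm.freq   = sample(0:5, 16, replace = TRUE))
nodes <- data.frame(
  agency = c("a", "b", "c", "d"),
  group  = c("x", "y", "x", "y"),
  state  = c("i", "j", "j", "i"))

net <- graph_from_data_frame(d = edges, vertices = nodes, directed = TRUE)

# Keep the edge attribute when removing loops and multi-edges
net2 <- simplify(net, remove.multiple = TRUE, remove.loops = TRUE,
                 edge.attr.comb = "first")

# Drop edges (not vertices) whose communication frequency is 0 or 1
net3 <- delete_edges(net2, E(net2)[comm.freq <= 1])

# Color vertices by group through a named lookup vector
colrs <- c(x = "gray50", y = "tomato")
V(net3)$color <- unname(colrs[V(net3)$group])

plot(net3, edge.width = E(net3)$comm.freq, edge.arrow.size = 0.5)
```

All four agencies stay in the graph; only the weak ties disappear.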
Is there a way to delete (or selectively display) vertices but retain edges in an igraph plot? For example, in the code below, we delete vertices but that deletes edges between them. My goal is to highlight a specific node but keep all edges.
g <- make_ring(10) %>%
set_vertex_attr("name", value = LETTERS[1:10])
g
V(g)
g2 <- delete_vertices(g, c(1,5)) %>%
delete_vertices("B")
g2
V(g2)
If you delete the vertices, the edges no longer make any sense. However, if all you want is to not display the vertices, you can just use vertex.size=0.
plot(g, vertex.size=0)
If you do not want to even see the node names, add vertex.label=NA.
You can show just one node by making a vector of vertex sizes and labels
VS = rep(0, vcount(g))
VS[2] = 14
VL = rep(NA, vcount(g))
VL[2] = V(g)$name[2]
VFC = rep(NA, vcount(g))
VFC[2] = "black"
VC = rep(NA, vcount(g))
VC[2] = 1
plot(g, vertex.size=VS, vertex.label=VL, vertex.color=VC,
vertex.frame.color=VFC)
inst2 = c(2, 3, 4, 5, 6)
motherinst2 = c(7, 8, 2, 10, 11)
km = c(20, 30, 40, 25, 60)
df2 = data.frame(inst2, motherinst2)
df2 = cbind(df2, km)
g2 = graph_from_data_frame(df2)
tkplot(g2)
How would I approach adding labels exclusively to the root and terminal vertices of a graph? I know it would involve this argument, but how would you set it up? Assume the graph object is just called 'g', or something obvious.
vertex.label =
The solution from @eipi10 is good, but the OP says "I'm finding it difficult to apply to my large data set effectively." I suspect that the issue is finding which intermediate nodes should have their names blanked out. I will continue the example of @eipi10. Since my answer is based on his, if you upvote my answer, please upvote his as well.
You can use the neighbors function to determine which points are sources and sinks. Everything else is an intermediate node.
## original graph from eipi10
g = graph_from_edgelist(cbind(c(rep(1,10),2:11), c(2:21)))
## Identify which nodes are intermediate
SOURCES = which(sapply(V(g), function(x) length(neighbors(g, x, mode="in"))) == 0)
SINKS = which(sapply(V(g), function(x) length(neighbors(g, x, mode="out"))) == 0)
INTERMED = setdiff(V(g), c(SINKS, SOURCES))
## Fix up the node names and plot
V(g)$name = V(g)
V(g)$name[INTERMED] = ""
plot(g)
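For large graphs, the same classification can probably be done without the sapply loops by looking at in- and out-degree. This is a sketch of that variant, not a replacement for the approach above; it passes labels through vertex.label instead of overwriting the name attribute, and the variable names are illustrative.

```r
library(igraph)

# Same example graph: vertex 1 points to 2..11, each of which points to one leaf
g <- graph_from_edgelist(cbind(c(rep(1, 10), 2:11), 2:21))

# Roots have no incoming edges; terminal vertices have no outgoing edges
is_root <- degree(g, mode = "in") == 0
is_leaf <- degree(g, mode = "out") == 0

# Label only roots and leaves; NA suppresses the label for everything else
labels <- ifelse(is_root | is_leaf, as.character(seq_len(vcount(g))), NA)

plot(g, vertex.label = labels)
```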
Using your example graph, we'll identify the root and terminal vertices and remove the labels for other vertices. Here's what the initial graph looks like:
set.seed(2)
plot(g2)
Now let's identify and remove the names of the intermediate vertices
# Get all edges
e = get.edgelist(g2)
# Root vertices are in first column but not in second column
root = setdiff(e[,1],e[,2])
# Terminal vertices are in second column but not in first column
terminal = setdiff(e[,2], e[,1])
# Vertices to remove are not in root or terminal vertices
remove = setdiff(unique(c(e)), c(root, terminal))
# Remove names of intermediate vertices
V(g2)$name[V(g2)$name %in% remove] = ""
set.seed(2)
plot(g2)
Original Answer
You can use set.vertex.attribute to change the label names. Here's an example:
library(igraph)
# Create a graph to work with
g = graph_from_edgelist(cbind(c(rep(1,10),2:11), c(2:21)))
plot(g)
Now we can remove the labels from the intermediate vertices:
g = set.vertex.attribute(g, "name", value=c(1,rep("", length(2:11)),12:21))
plot(g)
I have a network that looks like this
library(igraph)
library(igraphdata)
data("kite")
plot(kite)
I run a community detection and the result looks like this
community <- cluster_fast_greedy(kite)
plot(community,kite)
Now I want to extract a network based on the communities. The edge weight should be the number of ties between communities (how strong are communities connected to each other), the vertex attribute should be the number of nodes in the community (called numnodes).
d <- data.frame(E=c(1, 2, 3),
A=c(2, 3, 1))
g2 <- graph_from_data_frame(d, directed = F)
E(g2)$weight <- c(5, 1, 1)
V(g2)$numnodes <- c(4,3,3)
plot.igraph(g2,vertex.label=V(g2)$name, edge.color="black",edge.width=E(g2)$weight,vertex.size=V(g2)$numnodes)
The graph should look like this
One node is larger than the others, one edge has a lot of weight in comparison to the others.
As far as I know, igraph doesn't have a method to count the edges connecting groups of vertices. Therefore, to count the edges connecting communities, you need to iterate over each pair of communities. To count the members of each community, you can use the sizes method.
library(igraph)
library(igraphdata)
data("kite")
plot(kite)
community <- cluster_fast_greedy(kite)
plot(community,kite)
cedges <- NULL
for(i in seq(1,max(community$membership) - 1)){
for(j in seq(i + 1, max(community$membership))){
imembers <- which(community$membership == i)
jmembers <- which(community$membership == j)
weight <- sum(
mapply(function(v1) mapply(
function(v2) are.connected(kite, v1, v2),
jmembers),
imembers)
)
cedges <- rbind(cedges, c(i, j, weight))
}
}
cedges <- as.data.frame(cedges)
names(cedges)[3] <- 'weight'
cgraph <- graph_from_data_frame(cedges, directed = FALSE)
V(cgraph)$numnodes <- sizes(community)
plot.igraph(cgraph,
vertex.label = V(cgraph)$name,
edge.color = "black",
edge.width = E(cgraph)$weight,
vertex.size = V(cgraph)$numnodes)
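An alternative that may avoid the nested loops is to collapse each community into a single vertex with contract() and then merge the parallel edges with simplify(). The sketch below uses a small hand-made graph and a hand-written mapping so the result is easy to check; with real data the mapping would presumably be membership(community).

```r
library(igraph)

# Toy graph (illustrative): two triangles joined by two edges
g <- graph_from_literal(A-B-C-A, D-E-F-D, C-D, B-E)

# Stand-in for membership(community): first triangle is 1, second is 2
mapping <- unname(c(A = 1, B = 1, C = 1, D = 2, E = 2, F = 2)[V(g)$name])

# Give every edge and vertex a unit count so they can be summed
E(g)$weight <- 1
V(g)$numnodes <- 1

# Merge the vertices of each community, summing the vertex counts
cg <- contract(g, mapping,
               vertex.attr.comb = list(numnodes = "sum", "ignore"))

# Drop within-community loops and sum the parallel between-community edges
cg <- simplify(cg, remove.loops = TRUE,
               edge.attr.comb = list(weight = "sum", "ignore"))

plot(cg,
     edge.width = E(cg)$weight,
     vertex.size = V(cg)$numnodes * 10)
```

Unlike the loop version, community pairs with no connecting edges simply get no edge here, rather than an edge of weight 0.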
I have a graph net with two different types (1 and 2) of vertices, appearing n1 and n2 times, respectively:
net %v% "type" <- c(rep("1", n1), rep("2", n2))
We have some edges which were generated randomly with probabilities ps and pd, where ps is the edge probability with a same type (1-1 or 2-2) and pd with a different type (1-2).
I would like to plot this graph such that the edges between same types (i.e. 1-1 or 2-2) have a different color than edges between different types (1-2).
How do I do this?
I tried playing around with the %e% operator of the network package, but I'm confused about how to grab the type of the end node of each edge.
Thank you!
Is this what you want?
from <- sample(1:2, 10, replace = T)
to <- sample(1:2, 10, replace = T)
node <- cbind(from, to)
library(igraph)
net <- graph_from_edgelist(node, directed = F)
edge_color <- function(from_to){
from_node <- from_to[1]
to_node <- from_to[2]
if (from_node == to_node) "red" else "blue"
}
color<- apply(node, 1, edge_color)
plot(net, edge.color=color)
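The example above treats the vertex id itself as the type. If the type is stored as a vertex attribute, as in the question, one way to do it in igraph (a sketch only; the question uses the network package, and the attribute name vtype and the ring graph are made up for illustration) is to look up both endpoints of every edge with ends().

```r
library(igraph)

n1 <- 4
n2 <- 3
g <- make_ring(n1 + n2)  # deterministic stand-in for the random graph
V(g)$vtype <- c(rep("1", n1), rep("2", n2))

# Two-column matrix of endpoint vertex ids, one row per edge
el <- ends(g, E(g), names = FALSE)

# TRUE where both endpoints share the same type
same <- V(g)$vtype[el[, 1]] == V(g)$vtype[el[, 2]]
E(g)$color <- ifelse(same, "red", "blue")

plot(g, vertex.color = ifelse(V(g)$vtype == "1", "gold", "lightblue"))
```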