I have a very large bipartite network that I built from a 5-million-line dataset. I decompose the network into components because I cannot draw a graph of this size, and plotting the components one by one works fine. However, I want each node drawn with a shape that depends on its attribute: for example, a square for nodes with attribute "A" in my graph G and a triangle for nodes with attribute "B". I also want to label the vertices by attribute. Here is the code I use to plot the first component after creating the bipartite graph G, and it works:
components <- decompose(G)
plot(components[[1]])
I tried the following to add labels and change the vertex shapes according to the node attributes, but it didn't work:
plot(components[[1]], vertex.label= V(G)$attributes,
vertex.shape=c("square", "triangle"))
Can anyone help me? I'm stuck. Thank you so much!
The components function returns the component membership of each vertex, and groups() turns that into a list of the vertices that make up each component. So you need to traverse that list, create a subgraph for each entry, and plot it. As for plotting attributes, you need to provide a reproducible example for us to help.
library(igraph)
set.seed(8675309)
g <- sample_gnp(200, p = 0.01)
V(g)$name <- paste0("Node", 1:vcount(g))
V(g)$shape <- sample(c("circle","square"), vcount(g), replace = TRUE)
clu <- components(g)
grps <- groups(clu)
lapply(grps, function(x) plot(induced_subgraph(g, x)))
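For the shapes and labels asked about in the question, one possible approach is sketched below. It assumes the vertex attribute is literally called attributes (as in the question) and takes the values "A" and "B"; the toy graph and the shape_map lookup are made up for illustration. igraph has no built-in "triangle" shape, so this sketch substitutes "circle" for "B"; a real triangle would need a custom shape registered with add_shape(). The labels are read from the component itself rather than from the full graph, so their length matches the number of plotted vertices.

```r
library(igraph)

set.seed(1)
# toy graph standing in for the real bipartite graph G (hypothetical data)
g <- sample_gnp(30, p = 0.08)
V(g)$attributes <- sample(c("A", "B"), vcount(g), replace = TRUE)

# split into components and pick the largest one to plot
comps <- decompose(g)
comp <- comps[[which.max(sapply(comps, vcount))]]

# map each attribute value to a built-in igraph shape
shape_map <- c(A = "square", B = "circle")
node_shapes <- unname(shape_map[V(comp)$attributes])

# labels and shapes both come from the component, not from g
plot(comp,
     vertex.label = V(comp)$attributes,
     vertex.shape = node_shapes)
```

The key point is that every per-vertex argument passed to plot() has one entry per vertex of the component being drawn.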
I would like to create temporal networks in R, but the only resources I've found work with FR or KK layouts. However, the primary graph that I would like to base the layout on uses a DRL layout. How could I code this in R so that the layout is kept across plots?
Thank you
Added:
Code:
drl <- layout.drl(netfull, options=list(simmer.attraction=0))
plot(netfull, edge.arrow.size=2, vertex.size=.5, vertex.label.cex=.3, vertex.label.dist=.1, vertex.label.degree=pi, layout=drl)
plot(net7, edge.arrow.size=2, vertex.size=.5, vertex.label.cex=.3, vertex.label.dist=.1, vertex.label.degree=pi, layout=drl)
You can just explicitly compute your layout before plotting and then use the layout argument when you want to plot. DRL is one of the standard options provided by igraph.
library(igraph)
## create test graph
set.seed(1234)
g = erdos.renyi.game(15, 0.2, type = "gnp")
## Create a reusable layout for the graph
LO = layout_with_drl(g)
## plot using the layout
plot(g, layout=LO)
Edit
Based on the discussion in the comments, I have a different understanding of the question. I think that the question is this: given a graph g and a subgraph g2, plot both g and g2 with the corresponding nodes in the same place. This extra response addresses that.
Start with the example above to create the graph g and the layout LO.
Now we want to take a subgraph and plot it with the corresponding nodes in the same place. I will use as an example the graph that we get by removing nodes 2, 9, and 15.
If we simply remove those nodes, the new graph will have 12 nodes and they will have node IDs 1-12. In order to preserve the original numbering, we need to save the node IDs as labels.
V(g)$label = 1:15
Now let's create the subgraph by removing nodes 2,9 and 15.
g2 = induced_subgraph(g, V(g)[-c(2,9,15)])
We want to reuse the layout LO, but LO has the positions for all 15 original nodes. We want to select only the part for the remaining nodes in g2.
LO2 = LO[-c(2,9,15),]
Now we are ready to plot the original graph and the reduced graph so that the nodes line up.
par(mfrow=c(1,2), mar=c(2,1,2,1))
plot(g, layout=LO, frame=TRUE)
plot(g2, layout=LO2, frame=TRUE)
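When the subgraph is built some other way (as with netfull and net7 in the question), dropping rows by position may not be convenient. A minimal sketch of an alternative, assuming every vertex carries a unique name attribute (the "v1", "v2", ... names here are made up), is to pick the layout rows by matching names. By default plot() rescales each layout to its own bounding box, so this sketch also normalises the full layout once with norm_coords() and plots with rescale = FALSE, which keeps the coordinates identical in both panels.

```r
library(igraph)

set.seed(1234)
g <- sample_gnp(15, 0.2)
V(g)$name <- paste0("v", seq_len(vcount(g)))   # hypothetical vertex names

# compute the DRL layout once and scale it to the [-1, 1] box plot() draws in
LO <- layout_with_drl(g)
LO <- norm_coords(LO, xmin = -1, xmax = 1, ymin = -1, ymax = 1)

# build the subgraph by name, then look up its rows in the full layout
g2 <- delete_vertices(g, c("v2", "v9", "v15"))
LO2 <- LO[match(V(g2)$name, V(g)$name), , drop = FALSE]

par(mfrow = c(1, 2), mar = c(2, 1, 2, 1))
plot(g,  layout = LO,  rescale = FALSE, frame = TRUE)
plot(g2, layout = LO2, rescale = FALSE, frame = TRUE)
```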
# Erdos
par(mfrow=c(1,2))
g <- erdos.renyi.game(100, 1/100)
V(g)$size<-seq(0.05,5,0.05)
betweenness(g)
# Draw nodes and save positions
locs <- layout.fruchterman.reingold(g)
plot(g,
layout=locs,
vertex.label=NA,
main="Original",
vertex.color=degree(g))
g
vertex.color=degree(g)
did not work. Could anyone tell me how to color the vertices by "degree"?
Red (high value) to blue (low value) would be perfect.
Thanks!
One solution I found is to build a vector of colors from the grey shades that R provides. Running colors() in your console lists every color name that plot.igraph() accepts, including grey0 (black) through grey100 (white).
First load your data (graph, etc.):
edgelist <- read.csv(...)
g <- graph_from_data_frame(edgelist)
Then create a vector of grey shades with one entry per possible degree value, where X is the maximum degree:
X <- max(degree(g)) # must be at most 100, because the shades stop at grey100
colors <- paste0("grey", seq(X, 0)) # degree 0 gets the lightest shade, degree X gets grey0 (black)
Finally, plot the graph with the vertex.color argument (the + 1 is needed because R vectors are 1-indexed and a degree can be 0):
plot(g, vertex.color=colors[degree(g) + 1])
However, this little trick only works when the maximum degree is at most 100, because R only defines grey0 to grey100.
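For the red (high) to blue (low) scale the question asks for, a color ramp avoids the 100-shade limit. The sketch below is one way to do it under simple assumptions: the random graph is a stand-in for the real data, and pal and vcol are illustrative names. It builds one color per possible degree value with colorRampPalette() and indexes it by degree + 1.

```r
library(igraph)

set.seed(42)
# toy graph standing in for the real one (hypothetical data)
g <- sample_gnp(100, 3/100)
deg <- degree(g)

# one color per possible degree value 0..max(deg), running from blue to red
pal <- colorRampPalette(c("blue", "red"))(max(deg) + 1)

# degree 0 picks pal[1] (blue), the maximum degree picks the last entry (red)
vcol <- pal[deg + 1]

plot(g,
     layout = layout_with_fr(g),
     vertex.label = NA,
     vertex.size = 5,
     vertex.color = vcol,
     main = "Colored by degree")
```

Passing the raw degree vector as vertex.color does not give a gradient because plot() interprets numbers as indices into the current palette, which is why an explicit vector of color strings is used here.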
I have an interaction network and I used the following code to make an adjacency matrix and subsequently calculate the dissimilarity between the nodes of the network and then cluster them to form modules:
ADJ1=abs(adjacent_mat)^6
dissADJ1<-1-ADJ1
hierADJ<-hclust(as.dist(dissADJ1), method = "average")
Now I would like those modules to appear when I plot the graph with igraph.
g<-simplify(graph_from_adjacency_matrix(adjacent_mat, weighted=TRUE))
plot.igraph(g)
However, the only way I have found so far to turn hclust output into a graph follows this tutorial: http://gastonsanchez.com/resources/2014/07/05/Pretty-tree-graph/
phylo_tree = as.phylo(hierADJ)
graph_edges = phylo_tree$edge
graph_net = graph.edgelist(graph_edges)
plot(graph_net)
This is useful for showing a hierarchical lineage, but I just want the nodes that interact closely to cluster together, as follows:
Can anyone recommend how to use a command such as components from igraph to get these clusters to show?
igraph provides a bunch of different layout algorithms which are used to place nodes in the plot.
A good one to start with for a weighted network like this is the force-directed layout (implemented by layout.fruchterman.reingold in igraph).
Below is an example of the force-directed layout applied to some simple simulated data.
First, we create some mock data and clusters, along with some "noise" to make it more realistic:
library('dplyr')
library('igraph')
library('RColorBrewer')
set.seed(1)
# generate a couple clusters
nodes_per_cluster <- 30
n <- 10
nvals <- nodes_per_cluster * n
# cluster 1 (increasing)
cluster1 <- matrix(rep((1:n)/4, nodes_per_cluster) +
rnorm(nvals, sd=1),
nrow=nodes_per_cluster, byrow=TRUE)
# cluster 2 (decreasing)
cluster2 <- matrix(rep((n:1)/4, nodes_per_cluster) +
rnorm(nvals, sd=1),
nrow=nodes_per_cluster, byrow=TRUE)
# noise cluster
noise <- matrix(sample(1:2, nvals, replace=TRUE) +
rnorm(nvals, sd=1.5),
nrow=nodes_per_cluster, byrow=TRUE)
dat <- rbind(cluster1, cluster2, noise)
colnames(dat) <- paste0('n', 1:n)
rownames(dat) <- c(paste0('cluster1_', 1:nodes_per_cluster),
paste0('cluster2_', 1:nodes_per_cluster),
paste0('noise_', 1:nodes_per_cluster))
Next, we can use Pearson correlation to construct our adjacency matrix:
# create correlation matrix
cor_mat <- cor(t(dat))
# shift to [0,1] to separate positive and negative correlations
adj_mat <- (cor_mat + 1) / 2
# get rid of low correlations and self-loops
adj_mat <- adj_mat^3
adj_mat[adj_mat < 0.5] <- 0
diag(adj_mat) <- 0
Cluster the data using hclust and cutree:
# convert to dissimilarity matrix and cluster using hclust
dissim_mat <- 1 - adj_mat
dend <- dissim_mat %>%
as.dist %>%
hclust
clusters = cutree(dend, h=0.65)
# color the nodes
pal = colorRampPalette(brewer.pal(11,"Spectral"))(length(unique(clusters)))
node_colors <- pal[clusters]
Finally, create an igraph graph from the adjacency matrix and plot it using the fruchterman.reingold layout:
# create graph
g <- graph.adjacency(adj_mat, mode='undirected', weighted=TRUE)
# set node color and plot using a force-directed layout (fruchterman-reingold)
V(g)$color <- node_colors
coords_fr = layout.fruchterman.reingold(g, weights=E(g)$weight)
# igraph plot options
igraph.options(vertex.size=8, edge.width=0.75)
# plot network
plot(g, layout=coords_fr, vertex.color=V(g)$color)
In the above code, I generated two "clusters" of correlated rows, and a third group of "noise".
Hierarchical clustering (hclust + cutree) is used to assign the data points to clusters, and they are colored based on cluster membership.
The result looks like this:
For some more examples of clustering and plotting graphs with igraph, check out: http://michael.hahsler.net/SMU/LearnROnYourOwn/code/igraph.html
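If coloring alone is not enough to make the modules stand out, plot() can also draw a shaded hull around each group through its mark.groups argument. The following is only a sketch on made-up data: the island graph, the use of shortest-path distance as the dissimilarity, and the choice of k = 3 are all assumptions for illustration, not part of the original analysis.

```r
library(igraph)

set.seed(1)
# toy graph: 3 dense groups of 10 nodes, loosely connected (hypothetical data)
g <- sample_islands(3, 10, 0.6, 2)

# shortest-path distance as a stand-in dissimilarity between nodes
d <- distances(g)
d[is.infinite(d)] <- max(d[is.finite(d)]) + 1   # guard against disconnected pairs

# hierarchical clustering, cut into 3 modules
hc <- hclust(as.dist(d), method = "average")
membership <- cutree(hc, k = 3)

# mark.groups expects a list with one vector of vertex ids per group
group_list <- split(seq_len(vcount(g)), membership)

plot(g,
     layout = layout_with_fr(g),
     mark.groups = group_list,
     vertex.color = membership,
     vertex.label = NA,
     vertex.size = 8)
```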
You haven't shared some toy data for us to play with and suggest improvements to code, but your question states that you are only interested in plotting your clusters distinctly - that is, graphical presentation.
Although igraph comes with some nice force-directed layout algorithms, such as layout.fruchterman.reingold, layout_with_kk, etc., their output can quickly become difficult to interpret when the graph has a large number of nodes.
Like this:
With these traditional methods of visualising networks,
the layout algorithms, rather than the data, determine the visualisation
similar networks may end up being visualised very differently
large number of nodes will make the visualisation difficult to interpret
Instead, I find Hive Plots to be better at displaying important network properties, which, in your instance, are the cluster and the edges.
In your case, you can:
Plot each cluster on a different straight line
order the placement of nodes intelligently, so that nodes with certain properties are placed at the very end or start of each straight line
Colour the edges to identify direction of edge
To achieve this you will need to:
use the ggnetwork package to turn your igraph object into a dataframe
map your clusters to the nodes present in this dataframe
generate coordinates for the straight lines and map these to each cluster
use ggplot to visualise
There is also a hiveR package in R, should you wish to use a packaged solution. You might also find another visualisation technique for graphs very useful: BioFabric
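To give a rough feel for the idea, here is a simplified hive-style layout done with plain igraph instead of ggnetwork and ggplot, so it is a stand-in for the steps above rather than an implementation of them. Everything in it is assumed for illustration: the random graph, the three hypothetical cluster labels, and the choice of degree as the property that orders nodes along each axis.

```r
library(igraph)

set.seed(7)
# toy graph with three made-up clusters of 15 nodes each
g <- sample_gnp(45, 0.08)
V(g)$cluster <- rep(1:3, each = 15)
deg <- degree(g)

# one straight axis per cluster, 120 degrees apart
angles <- c(pi / 2, pi / 2 + 2 * pi / 3, pi / 2 + 4 * pi / 3)

# order nodes along their axis by degree rank within the cluster
pos <- ave(deg, V(g)$cluster,
           FUN = function(d) rank(d, ties.method = "first"))
radius <- 0.2 + pos / max(pos)

# polar to cartesian coordinates, one row per vertex
coords <- cbind(radius * cos(angles[V(g)$cluster]),
                radius * sin(angles[V(g)$cluster]))

plot(g,
     layout = coords,
     vertex.size = 4,
     vertex.label = NA,
     vertex.color = V(g)$cluster,
     edge.curved = 0.3)
```

Because the coordinates depend only on cluster and degree, two similar networks end up drawn in a similar way, which is the main argument made above for hive plots.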
I am analyzing an undirected graph in R. I'm trying to (eventually) write a function to get the ratio of the size (number of vertices) of the largest connected component to the size of the largest biconnected component - of any random graph. I was able to extract the size of the largest connected component, but am having trouble with the size of the largest biconnected component. I started off using the igraph function biconnected_components on graph g:
bicomponent_list <- biconnected_components(g)
bicomponent_list$components # lists all of the components, including size and vertex names
length(bicomponent_list$components[[1]]) # returns number of vertices of first bicomponent
Then my half-baked idea was to somehow order this list in decreasing number of vertices, so that I can always call length(bicomponent_list$components[[1]]) and it will be the largest biconnected component. But I don't know how to sort this correctly. Perhaps I have to convert it to a vector? But I also don't know how to specify that I want the number of vertices in the vector. Does anyone know, or have a better way to do it? Thanks so much!
library(igraph)
# generating sample graph
g1 <- barabasi.game(100, 1, 5)
V(g1)$name <- as.character(1:100)
g2 <- erdos.renyi.game(50, graph.density(g1), directed = TRUE)
V(g2)$name <- as.character(101:150) # g2 has 50 vertices, so it needs exactly 50 names
g3 <- graph.union(g1, g2, byname = TRUE)
# analyzing the bicomponents
bicomponent_list <- biconnected_components(g3)
bi_list <- as.list(bicomponent_list$components)
bi_list <- lapply(bi_list, length) # lists the sizes of all of the components that I want to reorder
My desired outcome would be ordering bi_list such that length(bicomponent_list$components[[1]]) returns the bicomponent with the most vertices.
The components property is a list containing vertex lists. You can iterate over them and find the length of them like so
sapply(bicomponent_list$components, length)
and if you just wanted the largest, wrap that in a max()
max(sapply(bicomponent_list$components, length))
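If the sorted list itself is wanted, as the question describes, the sizes can be passed to order(). The sketch below runs on a made-up random graph and assumes the graph has at least one edge (so that at least one biconnected component exists); the names sorted_components and size_ratio are illustrative.

```r
library(igraph)

set.seed(3)
# toy undirected graph standing in for the real one (hypothetical data)
g <- sample_gnp(60, 0.04)

bc <- biconnected_components(g)
sizes <- sapply(bc$components, length)

# reorder the list of components from largest to smallest
ord <- order(sizes, decreasing = TRUE)
sorted_components <- bc$components[ord]

# the first element is now always the largest biconnected component
largest_bicomp <- length(sorted_components[[1]])

# size of the largest connected component, for the ratio in the question
largest_comp <- max(components(g)$csize)
size_ratio <- largest_comp / largest_bicomp
```

Wrapping these lines in a function that takes g and returns size_ratio gives the reusable version the question is aiming for.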
From a data.frame (or any other R object type) with three columns, "Node", "Parent" and "text", I'd like to plot a tree with arrows from "Node" to "Parent" and "text" as the label.
Can anyone suggest a good library to use and example code, if possible?
I've been looking at the igraph library, but all the examples I could find plot trees with sequential numbers or letters as nodes, and it's not simple to set the tree layout.
Any help would be greatly appreciated
Thanks
EDIT:
Thanks guys for all your help, I really appreciate it.
Some extra comments, if you can help further
@md1630, I tried your suggestion but that's not what I'm looking for. The first code plots the tree with the root on top and the arrows from root to leaf, and the second corrects the arrows but inverts the tree. What I'd like is the root on top and arrows from the leaves to the root (I understand that may not be a tree per se, but that's the requirement).
@user20650, your solution looks correct, but the image starts to get crowded as the number of nodes increases. Any idea how to add more space between them?
@math, am I using the function you provided correctly? I called plot(layout.binary(g)) and got the result on the left. The one on the right is the output of plot(g).
upgrade comment
library(igraph)
# some example data
dat <- data.frame(parent=rep(letters[1:3], each=2),
node=letters[2:7],
text=paste0("lab", 1:6))
# create graph
g <- graph.data.frame(dat)
# plot
# layout.reingold.tilford gives a tree structure
# edge and vertex labels can be defined in the plot command or alternatively
# you can add them to the graph via V(g)$name and E(g)$label assignments
plot(g, layout = layout.reingold.tilford,
edge.label=E(g)$text, vertex.label=paste0("v_lab",1:7))
EDIT re comment
If you want the direction to go from the leaves towards the root, you can first get the tree layout coordinates from the more standard tree structure, and then reverse the edges.
# get tree layout coords
g <- graph.data.frame(dat)
lay = layout.reingold.tilford(g)
# redraw graph with edges reversed
g2 <- graph.data.frame(dat[2:1], vertices = get.data.frame(g, what="vertices"))
par(mar=rep(0,4), mfrow=c(1,2))
plot(g, layout=lay)
plot(g2, layout=lay)
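The same result can also be reached in one pass when the data frame has the column names from the question. This is a sketch under assumptions: the example rows are invented, the root is taken to be the one parent that never appears in the Node column, and the root (which has no row of its own) is labelled with its name. Edges are created from Node to Parent, and mode = "all" tells the tree layout to ignore their direction so the root still ends up on top.

```r
library(igraph)

# hypothetical data in the shape described in the question
df <- data.frame(Node   = c("b", "c", "d", "e", "f", "g"),
                 Parent = c("a", "a", "b", "b", "c", "c"),
                 text   = paste0("lab", 1:6),
                 stringsAsFactors = FALSE)

# edges point from each node to its parent (leaf -> root)
g <- graph_from_data_frame(df[, c("Node", "Parent")], directed = TRUE)

# vertex labels taken from the text column, matched by vertex name
lab <- df$text[match(V(g)$name, df$Node)]
lab[is.na(lab)] <- V(g)$name[is.na(lab)]   # the root has no text of its own

# the root is the parent that is never listed as a node
root_name <- setdiff(df$Parent, df$Node)
root_id <- which(V(g)$name == root_name)

# tree layout that ignores edge direction, root placed on top
lay <- layout_as_tree(g, root = root_id, mode = "all")

plot(g, layout = lay, vertex.label = lab, vertex.size = 25)
```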
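You can use Rgraphviz. Here's the code to plot the tree from a data frame df with columns "Node", "Parent" and "text". I didn't run this on my computer, so there may be bugs, but roughly this is the idea:
# install Rgraphviz from Bioconductor (BiocManager replaces the retired biocLite script)
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("Rgraphviz")
library("Rgraphviz")
# first set up the graph with just the nodes, as a character vector
# (parents are included so that the root is present too)
nodes <- unique(c(as.character(df$Node), as.character(df$Parent)))
gR <- new("graphNEL", nodes = nodes, edgemode = "directed")
# add an edge from Parent to Node for each row in df
for (j in seq_len(nrow(df))) {
gR <- addEdge(as.character(df$Parent[j]), as.character(df$Node[j]), gR, 1)
}
# add text labels as a character vector named by node
nAttrs <- list()
nAttrs$label <- setNames(as.character(df$text), as.character(df$Node))
# plot
plot(gR, nodeAttrs = nAttrs) # you can specify more attributes here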
You can use igraph to get a network with your data (supposing your dataframe is dd):
g = graph(t(dd[,2:1]))
V(g)$label = as.character(dd$text)
plot(g, layout=layout.binary)
I assumed your root (which has no parent) is not in the data frame; otherwise use dd[-1,2:1] instead.
If you want a tree, you can easily write your own layout: a layout is simply a function that takes a graph and returns a matrix of coordinates with one row per vertex. For a binary tree:
layout.binary = function(graph) {
layout = c()
r_vertex = length(V(graph))
depth = ceiling(log2(r_vertex+1))
for (ii in 0:(depth-1)) {
for (jj in 1:min(2^ii, r_vertex)) {
layout = rbind(layout, c(ii, (2*(jj-1)+1)/(2^(ii+1))))
}
r_vertex = r_vertex - 2^ii
}
return(layout)
}
It will plot a horizontal tree; use c((2*(jj-1)+1)/(2^(ii+1)), ii) if you want it to be vertical.
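A layout function like this is meant to be handed to plot() through the layout argument (either the function or the matrix it returns), not plotted on its own; plot(layout.binary(g)) only draws a scatter plot of the coordinates. The sketch below shows one way to use it. It restates the function in the vertical form with the root on top (the negated depth is my own tweak), renames it layout_binary to avoid clashing with anything else, and assumes the vertices are numbered level by level, as they are in the graph produced by make_tree().

```r
library(igraph)

# vertical variant of the binary layout: x is the position within a level,
# y is the negated depth so that the root is drawn on top
layout_binary <- function(graph) {
  coords <- NULL
  remaining <- vcount(graph)
  depth <- ceiling(log2(remaining + 1))
  for (ii in 0:(depth - 1)) {
    for (jj in seq_len(min(2^ii, remaining))) {
      coords <- rbind(coords, c((2 * (jj - 1) + 1) / (2^(ii + 1)), -ii))
    }
    remaining <- remaining - 2^ii
  }
  coords
}

# complete binary tree with 7 vertices, numbered level by level
g <- make_tree(7, children = 2)
lay <- layout_binary(g)

# pass the coordinates to plot() via the layout argument
plot(g, layout = lay)
```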