Subset igraph plot using activate function in R - r

I am trying to subset an igraph plot to display certain nodes based on a given vertex attribute. I have to subset them in the plot output to preserve the layout for the vertices. My code is the following:
plot.igraph(graph, layout=lo, vertex.label=NA, rescale=T, vertex.size = 4) %>%
tidygraph::activate(nodes) %>%
filter(period == 1)
But I receive the following error:
Error in UseMethod("activate") :
no applicable method for 'activate' applied to an object of class "NULL"
How can I subset the graph based on the vertex attribute "V(graph)$period", maintaining the vertices' layout?

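Observe that class(plot(graph)) returns "NULL": plot() draws the graph and returns nothing, so there is no graph object left for activate() to work on.
Update: calculate the subset as follows.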
## Random example.
library(igraph)
set.seed(20)
g <- make_ring(20)
V(g)$period <- sample(2, vcount(g), replace=TRUE)
V(g)$name <- V(g)
## Calculate subset of vertices
## and plot subgraph.
vvv <- V(g)[which(V(g)$period==1)]
g2 <- induced_subgraph(g, vvv)
plot(g2)
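Note that plotting the subgraph on its own computes a fresh layout, so the vertices move. A minimal sketch of one way to keep the positions, assuming the layout matrix has one row per vertex in vertex order (here `lo` is recomputed with layout_nicely() as a stand-in for the question's `lo`, and `keep`, `g_sub` and `lo_sub` are names made up for illustration): subset the layout rows with the same sorted index that is passed to induced_subgraph().

```r
library(igraph)

## Stand-in for the question's data: a ring with a random `period` attribute.
set.seed(20)
g <- make_ring(20)
V(g)$period <- sample(2, vcount(g), replace = TRUE)

## Layout computed once on the full graph and scaled into [-1, 1].
lo <- norm_coords(layout_nicely(g))

## Sorted integer index of the vertices to keep, reused for graph and layout.
keep <- which(V(g)$period == 1)
g_sub <- induced_subgraph(g, keep)
lo_sub <- lo[keep, , drop = FALSE]

## rescale = FALSE stops igraph from stretching the subset to fill the plot.
plot(g_sub, layout = lo_sub, rescale = FALSE,
     xlim = c(-1, 1), ylim = c(-1, 1),
     vertex.label = NA, vertex.size = 4)
```

The index is kept sorted on purpose: the rows of `lo_sub` are only guaranteed to line up with the vertices of `g_sub` if both follow the original vertex order.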

Related

Hide vertices from plot.igraph conditional on vertex attribute without deleting them

I have an igraph plot that is geographically laid out based on its latitude and longitude coordinates. I now want to hide certain points from one time period, while preserving the layout of the graph. I would therefore not like to delete the vertices from the network, but merely make them invisible in this particular plot rendering, conditional on a vertex attribute. Furthermore, the color attribute is already set to capture another variable, so I cannot use that to hide the points.
My plot is generated according to the following code:
lo <- layout.norm(as.matrix(g[, c("longitude","latitude")]))
plot.igraph(g, layout=lo, vertex.label=NA,rescale=T, vertex.size = 4)
The time attribute is a numerical variable stored in V(g)$period.
Is there code I can put within the plot.igraph function to hide vertices for which V(g)$period == 1?
Update.
Building upon Szabolcs's answer.
library(igraph)
## reproducible example
g <- make_graph("Zachary")
V(g)$name <- V(g)
set.seed(10)
lyt <- layout_with_drl(g)
V(g)$x <- lyt[,1]
V(g)$y <- lyt[,2]
plot(g)
del_vs <- c(4, 8, 9, 19, 24, 33)
dev.new(); plot(g - del_vs, main = paste("Zachary minus", toString(del_vs)))
Try invisible ink, e.g. print hidden objects in the background color.
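A rough sketch of the invisible-ink idea, under some assumptions: the `period` attribute is generated at random here because the question's data is not available, "orange" and "black" are placeholders for whatever colors the real plot already uses, and "transparent" is used instead of a specific background color.

```r
library(igraph)

g <- make_graph("Zachary")
set.seed(10)
lyt <- layout_with_drl(g)

## Hypothetical attribute standing in for the question's V(g)$period.
V(g)$period <- sample(2, vcount(g), replace = TRUE)
hide <- V(g)$period == 1

## Fill and frame of hidden vertices become transparent.
v_fill  <- ifelse(hide, "transparent", "orange")
v_frame <- ifelse(hide, "transparent", "black")

## An edge is hidden when either of its endpoints is hidden.
el <- ends(g, E(g), names = FALSE)
e_col <- ifelse(hide[el[, 1]] | hide[el[, 2]], "transparent", "grey50")

plot(g, layout = lyt, vertex.label = NA,
     vertex.color = v_fill, vertex.frame.color = v_frame,
     edge.color = e_col)
```

Since the fill color is still a per-vertex vector, an existing color coding can be kept by replacing "orange" with that vector.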
Or try this.
library(igraph)
## reproducible example.
g <- make_graph("Zachary")
V(g)$name <- V(g)
set.seed(10)
lyt <- layout_with_drl(g)
plot(g, layout=lyt)
## delete vertices and preserve layout.
del_vs <- c(9, 19, 24, 33)
g2 <- g - del_vs
g2$main <- paste("Zachary minus", toString(del_vs))
g2$layout <- matrix(lyt[-del_vs,], ncol=2)
dev.new(); plot(g2)
See also: Looking to save coordinates/layout to make temporal networks in Igraph with DRL.
You can store the coordinates in the x and y vertex attributes. Then they will be used by plot automatically, and they will be preserved when you delete vertices.
For example:
g<-make_ring(4)
V(g)$x <- c(0,0,1,1)
V(g)$y <- c(0,1,0,1)
plot(g)
plot(delete_vertices(g,1))
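The same idea applied to a vertex attribute, as a small sketch. The coordinates and `period` values below are invented for illustration; the axis limits are taken from the full graph so that both plots share one frame.

```r
library(igraph)

g <- make_ring(6)
## Hypothetical coordinates and period values, for illustration only.
V(g)$x <- c(0, 1, 2, 2, 1, 0)
V(g)$y <- c(0, 0, 0, 1, 1, 1)
V(g)$period <- c(1, 2, 2, 1, 2, 2)

## Drop the period-1 vertices; x and y travel with the remaining vertices.
g_vis <- delete_vertices(g, which(V(g)$period == 1))

## Limits from the full graph, with rescaling switched off.
xl <- range(V(g)$x)
yl <- range(V(g)$y)
par(mfrow = c(1, 2))
plot(g, rescale = FALSE, xlim = xl, ylim = yl, vertex.size = 30)
plot(g_vis, rescale = FALSE, xlim = xl, ylim = yl, vertex.size = 30)
```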

group variables and color by type R igraph

I have a graph where I need the vertices to be shown by name only, each with a different color according to its type of data (stock, forex and commodities), but I don't understand how to do it.
Something similar is done in the post "igraph group vertices based on community". I don't need circles, only the letters, colored according to the type of data.
library(Hmisc) # For correlation matrix
library(corrplot) # For correlation matrix
library("Spillover")
library(readxl)
library("xts")
library("zoo") #
library("ggplot2")
library("nets")
library("MASS")
library("igraph")
library("reshape") # For "melt" function/cluster network
library("writexl")
# We compute the dynamic interdependence. Direct edges denoted as Granger causality linkages
# We compute the contemporaneous interdependence. Indirect edges denoted as Partial correlation linkages
stock <- table[1:55, 1:55]
forex <- table[56:95, 56:95]
commodities <- table[96:116, 96:116]
dim(stock); dim(forex); dim(commodities)
View(stock);
View(forex);
View(commodities)
# Full network
network.spill <- graph.adjacency(table, mode='directed')
degree <- degree(network.spill) # number of adjacent edges
between <- betweenness(network.spill)
close <- closeness(network.spill, mode = "all")
autorsco <- authority.score(network.spill)$vector
eccentry <- eccentricity(network.spill, mode = "all")
measures <- cbind(as.matrix(degree), as.matrix(between), as.matrix(close), as.matrix(autorsco), as.matrix(eccentry))
View(measures)
V(network.spill)[which(network.spill_forex)]$color="red"
V(network.spill)$size <- round((degree-min(degree))/(max(degree)-min(degree))) # To create vertex
V(network.spill)$shape <- "sphere"
E(network.spill)$name=1:116
# For stock returns layout_nicely(network.spill)
par(mfcol = c(1, 1))
plot( network.spill, layout = layout_nicely(network.spill), vertex.color = c("gold"), vertex.label.cex=0.6,
vertex.size = autorsco*2,edge.curved = 0.2, edge.arrow.mode=0.5,edge.arrow.size=1.5) # To make the chart
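One possible way to get labels without circles, colored by group, sketched under assumptions: the question's `table` is not available, so a random 0/1 adjacency matrix of the same size stands in for it, and the names `adj`, `net`, `asset_type` and `type_col` are made up for this example. The group sizes follow the row ranges in the question (55 stock, 40 forex, 21 commodities).

```r
library(igraph)

## Hypothetical stand-in for the question's 116 x 116 `table`.
set.seed(1)
n_type <- c(stock = 55, forex = 40, commodities = 21)
n <- sum(n_type)
adj <- matrix(rbinom(n * n, 1, 0.02), n, n,
              dimnames = list(paste0("a", 1:n), paste0("a", 1:n)))
diag(adj) <- 0

net <- graph_from_adjacency_matrix(adj, mode = "directed")

## One group label per vertex, in row order.
V(net)$asset_type <- rep(names(n_type), times = n_type)
type_col <- c(stock = "blue", forex = "red", commodities = "darkgreen")

## shape "none" draws only the label; label color encodes the group.
plot(net, layout = layout_nicely(net),
     vertex.shape = "none",
     vertex.label = V(net)$name,
     vertex.label.color = unname(type_col[V(net)$asset_type]),
     vertex.label.cex = 0.6,
     edge.arrow.size = 0.2, edge.curved = 0.2)
legend("topleft", legend = names(type_col), text.col = type_col, bty = "n")
```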

Create bipartite graph in R?

So this question has been asked here and here, but I can't seem to adapt it to my problem. I am trying to create a bipartite graph using the igraph package in R, that looks something like this:
The code I'm using to try this is:
# create all pairs and turn into vector for graph edges
pairs <- expand.grid(1:6, 1:6) # create all pairs
pairs <- pairs[!pairs$Var1 == pairs$Var2, ] # remove matching rows
ed <- as.vector(t(pairs)) # turn into vector
# create graph
g <- make_empty_graph(n = 6)
g <- add_edges(graph = g, edges = ed)
plot(g)
This will create a graph, but I'm trying to make it resemble the graph in the image, with, say, (1,2,3) on the top and (4,5,6) on the bottom.
I tried using make_bipartite_graph() and layout_as_bipartite, but I can't seem to get it to work. Any suggestions?
If the graph is created straight from the data.frame it will not be a bipartite graph.
library(igraph)
g <- graph_from_data_frame(df)
is.bipartite(g)
#[1] FALSE
But it will be a bipartite graph if created from the incidence matrix.
tdf <- table(df)
g <- graph.incidence(tdf, weighted = TRUE)
is.bipartite(g)
#[1] TRUE
Now plot it.
colrs <- c("green", "cyan")[V(g)$type + 1L]
plot(g, vertex.color = colrs, layout = layout_as_bipartite)
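For the six-vertex case in the question, a sketch that skips the data frame and builds the graph directly with make_bipartite_graph(). It assumes every vertex in (1,2,3) should be joined to every vertex in (4,5,6), which may not match the image; `types` and `lay` are names chosen for this example.

```r
library(igraph)

## Vertices 1-3 form one group (type FALSE), 4-6 the other (type TRUE).
types <- rep(c(FALSE, TRUE), each = 3)

## Every cross-group pair, flattened to c(from1, to1, from2, to2, ...).
pairs <- expand.grid(a = 1:3, b = 4:6)
ed <- as.vector(t(as.matrix(pairs)))

g <- make_bipartite_graph(types, ed)

## Two-row layout: each type gets its own row.
lay <- layout_as_bipartite(g)

plot(g, layout = lay, vertex.color = c("green", "cyan")[V(g)$type + 1L])
```

If the rows come out the wrong way round, flipping the sign of the second column of `lay` before plotting should swap them.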

R Indexing a matrix to use in plot coordinates

I'm trying to plot a temporal social network in R. My approach is to create a master graph and layout for all nodes. Then, I will subset the graph based on a series of vertex IDs. However, when I do this and lay out the graph, I get completely different node locations. I think I'm subsetting the layout matrix incorrectly, but I can't locate the issue because I've done some smaller matrix subsets and everything seems to work fine.
I have some example code and an image of the issue in the network plots.
library(igraph)
# make graph
g <- barabasi.game(25)
# make graph and set some aesthetics
set.seed(123)
l <- layout_nicely(g)
V(g)$size <- rescale(degree(g), c(5, 20))
V(g)$shape <- 'none'
V(g)$label.cex <- .75
V(g)$label.color <- 'black'
E(g)$arrow.size = .1
# plot graph
dev.off()
par(mfrow = c(1,2),
mar = c(1,1,5,1))
plot(g, layout = l,
main = 'Entire\ngraph')
# use index & induced subgraph
v_ids <- sample(1:25, 15, F)
sub_l <- l[v_ids, c(1,2)]
sub_g <- induced_subgraph(g, v_ids)
# plot second graph
plot(sub_g, layout = sub_l,
main = 'Sub\ngraph')
The vertices in the second plot should match layout of those in the first.
Unfortunately, you set the random seed after you generated the graph,
so we cannot exactly reproduce your result. I will use the same code but
with set.seed before the graph generation. This makes the result look
different than yours, but will be reproducible.
When I run your code, I do not see exactly the same problem as you are
showing.
Your code (with set.seed moved and scales added)
library(igraph)
library(scales) # for rescale function
# make graph
set.seed(123)
g <- barabasi.game(25)
# make graph and set some aesthetics
l <- layout_nicely(g)
V(g)$size <- rescale(degree(g), c(5, 20))
V(g)$shape <- 'none'
V(g)$label.cex <- .75
V(g)$label.color <- 'black'
E(g)$arrow.size = .1
## V(g)$names = 1:25
# plot graph
dev.off()
par(mfrow = c(1,2),
mar = c(1,1,5,1))
plot(g, layout = l,
main = 'Entire\ngraph')
# use index & induced subgraph
v_ids <- sort(sample(1:25, 15, F))
sub_l <- l[v_ids, c(1,2)]
sub_g <- induced_subgraph(g, v_ids)
# plot second graph
plot(sub_g, layout = sub_l,
main = 'Sub\ngraph', vertex.label=V(sub_g)$names)
When I run your code, both graphs have nodes in the same
positions. That is not what I see in the graph in your question.
I suggest that you run just this code and see if you don't get
the same result (nodes in the same positions in both graphs).
The only difference between the two graphs in my version is the
node labels. When you take the subgraph, it renumbers the nodes
from 1 to 15 so the labels on the nodes disagree. You can fix
this by storing the node labels in the graph before taking the
subgraph. Specifically, add V(g)$names = 1:25 immediately after
your statement E(g)$arrow.size = .1. Then run the whole thing
again, starting at set.seed(123). This will preserve the
original numbering as the node labels.
The graph looks slightly different because the new subgraph
does not take up all of the space, so plot rescales it to
fill the empty space.
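If the stretching matters, one option is to scale the full layout once and switch off rescaling in both plots. This is a sketch rather than the code from the question: sample_pa() replaces barabasi.game(), a `label` attribute is used to keep the original numbering, and a fixed vertex.size replaces the degree-based sizes to avoid the extra package.

```r
library(igraph)

set.seed(123)
g <- sample_pa(25)
V(g)$label <- seq_len(vcount(g))      # keep original numbering as labels

## Scale the full layout into [-1, 1] once, then never rescale again.
l <- norm_coords(layout_nicely(g))

## Sorted ids so layout rows and subgraph vertices stay aligned.
v_ids <- sort(sample(1:25, 15))
sub_g <- induced_subgraph(g, v_ids)
sub_l <- l[v_ids, ]

par(mfrow = c(1, 2), mar = c(1, 1, 5, 1))
plot(g, layout = l, rescale = FALSE, main = "Entire\ngraph",
     vertex.size = 12, edge.arrow.size = 0.1)
plot(sub_g, layout = sub_l, rescale = FALSE, main = "Sub\ngraph",
     vertex.size = 12, edge.arrow.size = 0.1)
```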
A possible quick workaround: draw the same graph, but color the nodes and edges that you don't need in the color of your background. Depending on your purpose, that may be enough.

R getting subtrees from dendrogram based on cutree labels

I have clustered a large dataset and found 6 clusters I am interested in analyzing more in depth.
I found the clusters using hclust with "ward.D" method, and I would like to know whether there is a way to get "sub-trees" from hclust/dendrogram objects.
For example
library(gplots)
library(dendextend)
data <- iris[,1:4]
distance <- dist(data, method = "euclidean", diag = FALSE, upper = FALSE)
hc <- hclust(distance, method = 'ward.D')
dnd <- as.dendrogram(hc)
plot(dnd) # to decide the number of clusters
clusters <- cutree(dnd, k = 6)
I used cutree to get the labels for each of the rows in my dataset.
I know I can get the data for each corresponding cluster (cluster 1 for example) with:
c1_data = data[clusters == 1,]
Is there any easy way to get the subtrees for each corresponding label as returned by dendextend::cutree? For example, say I am interested in getting the
I know I can access the branches of the dendrogram doing something like
subtree <- dnd[[1]][[2]]
but how I can get exactly the subtree corresponding to cluster 1?
I have tried
dnd[clusters == 1]
but this of course doesn't work. So how can I get the subtree based on the labels returned by cutree?
================= UPDATED answer
This can now be solved using get_subdendrograms from dendextend.
# needed packages:
# install.packages("gplots")
# install.packages("viridis")
# install.packages("devtools")
# devtools::install_github('talgalili/dendextend') # dendextend from github
library(dendextend)
# define dendrogram object to play with:
dend <- iris[,-5] %>% dist %>% hclust %>% as.dendrogram %>% set("labels_to_character") %>% color_branches(k=5)
dend_list <- get_subdendrograms(dend, 5)
# Plotting the result
par(mfrow = c(2,3))
plot(dend, main = "Original dendrogram")
sapply(dend_list, plot)
This can also be used within a heatmap:
# plot a heatmap of only one of the sub dendrograms
par(mfrow = c(1,1))
library(gplots)
sub_dend <- dend_list[[1]] # get the sub dendrogram
# make sure of the size of the dend
nleaves(sub_dend)
length(order.dendrogram(sub_dend))
# get the subset of the data
subset_iris <- as.matrix(iris[order.dendrogram(sub_dend),-5])
# update the dendrogram's internal order so to not cause an error in heatmap.2
order.dendrogram(sub_dend) <- rank(order.dendrogram(sub_dend))
heatmap.2(subset_iris, Rowv = sub_dend, trace = "none", col = viridis::viridis(100))
================= OLDER answer
I think what can be helpful for you are these two functions:
The first one just iterates through all clusters and extracts substructure. It requires:
the dendrogram object from which we want to get the subdendrograms
the clusters labels (e.g. returned by cutree)
Returns a list of subdendrograms.
extractDendrograms <- function(dendr, clusters){
  lapply(unique(clusters), function(clust.id){
    getSubDendrogram(dendr, which(clusters==clust.id))
  })
}
The second one performs a depth-first search to determine in which subtree the cluster exists and, if that subtree matches the full cluster, returns it. Here, we use the assumption that all elements of a cluster are in one subtree. It requires:
the dendrogram object
positions of the elements in cluster
Returns a subdendrogram corresponding to the cluster of given elements.
getSubDendrogram <- function(dendr, my.clust){
  if(all(unlist(dendr) %in% my.clust))
    return(dendr)
  if(any(unlist(dendr[[1]]) %in% my.clust ))
    return(getSubDendrogram(dendr[[1]], my.clust))
  else
    return(getSubDendrogram(dendr[[2]], my.clust))
}
Using these two functions we can use the variables you have provided in the question and get the following output. (I think the line clusters <- cutree(dnd, k = 6) should be clusters <- cutree(hc, k = 6) )
my.sub.dendrograms <- extractDendrograms(dnd, clusters)
Plotting all six elements of the list shows all the subdendrograms.
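A self-contained sketch of that plotting step, using base R only. getSubDendrogram is repeated so the block runs on its own, the cluster labels come from cutree(hc, k = 6) as suggested above, and `ids` and `subs` are names introduced here for illustration.

```r
## Data and clustering as in the question.
data <- iris[, 1:4]
hc <- hclust(dist(data), method = "ward.D")
dnd <- as.dendrogram(hc)
clusters <- cutree(hc, k = 6)   # labels in the order of the data rows

## Depth-first search for the subtree whose leaves all belong to my.clust.
getSubDendrogram <- function(dendr, my.clust) {
  if (all(unlist(dendr) %in% my.clust)) return(dendr)
  if (any(unlist(dendr[[1]]) %in% my.clust)) {
    getSubDendrogram(dendr[[1]], my.clust)
  } else {
    getSubDendrogram(dendr[[2]], my.clust)
  }
}

ids <- sort(unique(clusters))
subs <- lapply(ids, function(id) getSubDendrogram(dnd, which(clusters == id)))

## One panel per cluster; leaf labels are dropped to keep the panels readable.
par(mfrow = c(2, 3))
for (i in seq_along(subs)) {
  plot(subs[[i]], main = paste("Cluster", ids[i]), leaflab = "none")
}
```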
EDIT
As suggested in the comment, I add a function that as an input takes a dendrogram dend and the number of subtrees k, but it still uses the previously defined, recursive function getSubDendrogram:
prune_cutree_to_dendlist <- function(dend, k, order_clusters_as_data=FALSE) {
clusters <- cutree(dend, k, order_clusters_as_data)
lapply(unique(clusters), function(clust.id){
getSubDendrogram(dend, which(clusters==clust.id))
})
}
A test case for 5 substructures:
library(dendextend)
dend <- iris[,-5] %>% dist %>% hclust %>% as.dendrogram %>% set("labels_to_character") %>% color_branches(k=5)
subdend.list <- prune_cutree_to_dendlist(dend, 5)
#plotting
par(mfrow = c(2,3))
plot(dend, main = "original dend")
sapply(subdend.list, plot)
I have performed some benchmark using rbenchmark with the function suggested by Tal Galili (here named prune_cutree_to_dendlist2) and the results are quite promising for the DFS approach from the above:
library(rbenchmark)
benchmark(prune_cutree_to_dendlist(dend, 5),
prune_cutree_to_dendlist2(dend, 5), replications=5)
                                test replications elapsed relative user.self
1  prune_cutree_to_dendlist(dend, 5)            5    0.02        1     0.020
2 prune_cutree_to_dendlist2(dend, 5)            5   60.82     3041    60.643
I have now written the function prune_cutree_to_dendlist to do what you asked for. I should add it to dendextend at some point in the future.
In the meantime, here is an example of the code and output (the function is a bit slow. Making it faster relies on having prune be faster, which I won't get to fixing in the near future.)
# install.packages("dendextend")
library(dendextend)
dend <- iris[,-5] %>% dist %>% hclust %>% as.dendrogram %>%
set("labels_to_character")
dend <- dend %>% color_branches(k=5)
# plot(dend)
prune_cutree_to_dendlist <- function(dend, k) {
clusters <- cutree(dend,k, order_clusters_as_data = FALSE)
# unique_clusters <- unique(clusters) # could also be 1:k but it would be less robust
# k <- length(unique_clusters)
# for(i in unique_clusters) {
dends <- vector("list", k)
for(i in 1:k) {
leves_to_prune <- labels(dend)[clusters != i]
dends[[i]] <- prune(dend, leves_to_prune)
}
class(dends) <- "dendlist"
dends
}
prunned_dends <- prune_cutree_to_dendlist(dend, 5)
sapply(prunned_dends, nleaves)
par(mfrow = c(2,3))
plot(dend, main = "original dend")
sapply(prunned_dends, plot)
How did you get 6 clusters using hclust? You can cut the tree at any point, so you just ask cutree to give you more clusters:
clusters = cutree(hclusters, number_of_clusters)
If you have a lot of data this may not be very handy though. In these cases what I do is manually picking the clusters that I want to study further and then running hclust only on the data in these clusters. I don't know of any functionality in hclust that allows you to do this automatically, but it's quite easy:
good_clusters = c(which(clusters==1),
which(clusters==2)) # or whichever clusters you want
new_df = df[good_clusters,]
new_hclusters = hclust(dist(new_df)) # hclust expects a dist object, not a data frame
new_clusters = cutree(new_hclusters, new_number_of_clusters)
