igraph use of %>% as continuation - r

I am using igraph with R, trying to create a graph with labeled vertices.
The igraph docs for "paths" give the example code
g <- make_empty_graph(directed = FALSE, n = 10) %>%
set_vertex_attr("name", value = letters[1:10])
plot(g) # labels the vertices with letters a through j
However
g <- make_empty_graph(directed = FALSE, n = 10)
set_vertex_attr(g, "name", value = letters[1:10])
plot(g) # now labels the vertices as numbers 1:10
Why?
The igraph docs for "set_vertex_attr" use
set_vertex_attr(g, "name", value = letters[1:10])
to set vertex attributes.
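A likely explanation, sketched under the assumption that nothing else in the session touches g: set_vertex_attr() does not modify g in place, it returns a modified copy. In the piped version the result of the whole chain is assigned to g, while in the second version the return value is discarded. The names g_named and g_piped below are only illustrative.

```r
library(igraph)

g <- make_empty_graph(directed = FALSE, n = 10)

# g itself has no "name" attribute until a result is assigned back
is.null(V(g)$name)

# Assign the return value to keep the attribute
g_named <- set_vertex_attr(g, "name", value = letters[1:10])

# The pipe is just another way to write the same call; here the
# result of the whole chain is what gets assigned
g_piped <- make_empty_graph(directed = FALSE, n = 10) %>%
  set_vertex_attr("name", value = letters[1:10])

identical(V(g_named)$name, V(g_piped)$name)
```

With either form, plot() should then label the vertices with the letters.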

Related

Bipartite graph projection with nodes as edge attributes

I have a bipartite graph and I want the projections of this graph to have edge attributes that record via which nodes they were connected. For example:
require(igraph)
set.seed(123)
g <- sample_bipartite(5, 5, p =.5)
V(g)$name <- c(letters[1:5], 1:5)
g1 <- bipartite_projection(g)[[1]]
g2 <- bipartite_projection(g)[[2]]
par(mfrow = c(1, 3))
plot(g,
     vertex.shape = ifelse(V(g)$type == FALSE, "square", "circle"),
     vertex.color = ifelse(V(g)$type == FALSE, "gold", "tomato"),
     main = "Bipartite")
plot(g1,
     main = "Projection 1")
plot(g2,
     main = "Projection 2")
par(mfrow = c(1, 1))
I want the information that I added by hand to the plot to be in the network object. Is it easily done in igraph? Thanks.
With bipartite_projection
If you really want to use bipartite_projection, you can try to define a custom function f like the one below:
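f <- function(gp) {
  # endpoints of every edge in the projection
  df <- get.data.frame(gp)[1:2]
  # label each edge with the largest name among the neighbors
  # that both endpoints share in the bipartite graph g
  df$lbl <- apply(
    df,
    1,
    function(v) {
      max(do.call(intersect, unname(lapply(v, function(x) names(neighbors(g, x))))))
    }
  )
  res <- graph_from_data_frame(df, directed = FALSE)
  plot(res, edge.label = E(res)$lbl)
}
f(g1)
f(g2)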
which plots each projection with the connecting node as the edge label.
Without bipartite_projection
Below is an option without using bipartite_projection (taking g1 as an example; g2 can be obtained in a similar way)
g1 <- simplify(
  graph_from_data_frame(
    do.call(
      rbind,
      lapply(
        Filter(
          function(x) nrow(x) > 1,
          split(get.data.frame(g), ~to)
        ),
        function(d) {
          with(
            d,
            cbind(data.frame(t(combn(from, 2))), weight = unique(to))
          )
        }
      )
    ),
    directed = FALSE
  ),
  edge.attr.comb = "max"
)
and plot(g1, edge.label = E(g1)$weight) then draws the projection with those labels.
First, I made a data frame from the as_edgelist() result, then built a label for each edge with paste0(). Next, I used edge_attr() to write the labels to the graph object.
el <- as.data.frame(igraph::as_edgelist(g))
el$lab <- paste0(el$V1, "_", el$V2)
edge_attr(g, "label") <- el$lab
E(g)$label
set.seed(232)
plot(g,
edge.label.dist=.3,
edge.label.color="blue",
margin=-0.4,
layout=layout.fruchterman.reingold)
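If an edge in the projection can be supported by more than one shared node, it may be useful to store all of them rather than a single one. The following is only a sketch on a small hand-made bipartite graph (the edge list and the attribute name via are made up for illustration); it records every shared neighbor as a comma-separated string on each projected edge.

```r
library(igraph)

# Small hand-made bipartite graph: letters on one side, digits on the other
el <- data.frame(from = c("a", "a", "b", "b", "c"),
                 to   = c("1", "2", "1", "2", "2"))
g <- graph_from_data_frame(el, directed = FALSE)
V(g)$type <- V(g)$name %in% el$to   # FALSE = letters, TRUE = digits

g1 <- bipartite_projection(g)[[1]]  # projection onto the letters

# For every projected edge, collect the neighbors both endpoints
# share in the original bipartite graph
pairs <- ends(g1, E(g1))
E(g1)$via <- vapply(seq_len(nrow(pairs)), function(i) {
  shared <- intersect(names(neighbors(g, pairs[i, 1])),
                      names(neighbors(g, pairs[i, 2])))
  paste(sort(shared), collapse = ",")
}, character(1))

E(g1)$via
```

The attribute then lives in the graph object, so plot(g1, edge.label = E(g1)$via) can be used to display it.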

How to calculate a maximum-bottleneck path with igraph?

Given a capacity network with a single source and a single sink, how can I calculate the maximum-bottleneck path (also known as the widest path or maximum capacity path problem) using igraph?
I've read (e.g. here or even with pseudocode there) that it is possible with some modifications to Dijkstra’s algorithm, but I do not want to dive into algorithm development; I would rather use igraph instead.
Example
library(igraph)
set.seed(21)
nodes = cbind(
'id' = c('Fermenters', 'Methanogens', 'carbs', 'CO2', 'H2', 'other', 'CH4', 'H2O')
)
from <- c('carbs', rep('Fermenters', 3), rep('Methanogens', 2), 'CO2', 'H2')
to <- c('Fermenters', 'other', 'CO2', 'H2', 'CH4', 'H2O', rep('Methanogens', 2))
weight <- sample(1 : 20, 8)
links <- data.frame(from, to, weight, stringsAsFactors = FALSE)
net = graph_from_data_frame(links, vertices = nodes, directed = T)
## Calculate max-bottleneck here !
# # disabled because just vis
# plot(net, edge.width = E(net)$weight)
# require(networkD3)
# require(tidyverse)
#
# d3net <- igraph_to_networkD3(net, group = rep(1, 8))
# forceNetwork(
# Links = mutate(d3net$links, weight = E(net)$weight), Nodes = d3net$nodes,
# Source = 'source', Target = 'target',
# NodeID = 'name', Group = "group", Value = "weight",
# arrows = TRUE, opacity = 1, opacityNoHover = 1
# )
So with respect to the example, how would I calculate the maximum capacity path from carbs to H2O?
I don't know how efficient this would be, but you could use igraph to find all "simple" paths, then calculate the minimum edge weight of each, then choose the max...
require(tibble)
require(igraph)
nodes = tibble('id' = c('A', "B", "C", "D"))  # data_frame() is deprecated in tibble
links = tribble(
~from, ~to, ~weight,
"A" , "B", 10,
"B", "C", 10,
"C", "D", 6,
"A", "D", 4,
)
net = graph_from_data_frame(links, vertices = nodes, directed = T)
simple_paths <- all_simple_paths(net, "A", "D")
simple_paths[which.max(
sapply(simple_paths, function(path) {
min(E(net, path = path)$weight)
})
)]
# [[1]]
# + 4/4 vertices, named, from 061ab8d:
# [1] A B C D
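Enumerating all simple paths can blow up on larger graphs. One alternative that still only uses igraph calls is a threshold search: try the distinct weights from largest to smallest, drop every edge below the current threshold, and stop at the first threshold for which the target is still reachable. This is a sketch, not a tuned implementation; the helper name widest_path is made up, and it assumes a named graph with a numeric weight edge attribute.

```r
library(igraph)

# Hypothetical helper: returns the bottleneck value and one widest path
widest_path <- function(graph, from, to) {
  for (w in sort(unique(E(graph)$weight), decreasing = TRUE)) {
    # keep only edges whose capacity is at least w
    sub <- delete_edges(graph, which(E(graph)$weight < w))
    d <- distances(sub, v = from, to = to, mode = "out", weights = NA)
    if (is.finite(d[1, 1])) {
      p <- shortest_paths(sub, from, to, mode = "out", weights = NA)$vpath[[1]]
      return(list(bottleneck = w, path = as_ids(p)))
    }
  }
  NULL  # target not reachable at all
}

links <- data.frame(from   = c("A", "B", "C", "A"),
                    to     = c("B", "C", "D", "D"),
                    weight = c(10, 10, 6, 4))
net <- graph_from_data_frame(links, directed = TRUE)
res <- widest_path(net, "A", "D")
res
```

On the four-node example this should agree with the enumeration above. The loop runs one reachability check per distinct weight, so a binary search over the sorted weights would be a natural refinement.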
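You could try the same idea as in "IGRAPH IN R: Find the path between vertices that maximizes the product of edge attributes". Divide the weights by their total to keep them < 1 (so that the negated log-weights are positive), run a shortest-path search on -log of the result, and take the min along the returned path. Note that this maximizes the product of the normalized weights, which is a different objective from the bottleneck, so it is not guaranteed to return the widest path in general: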
x<-shortest_paths(net,3,8, weights=-log(E(net)$weight/sum(E(net)$weight)), output="epath")[[2]]
E(net)[x[[1]]]
min(E(net)$weight[x[[1]]])
which gives
+ 4/8 edges from 57589bc (vertex names):
[1] carbs ->Fermenters Fermenters ->H2 H2 ->Methanogens Methanogens->H2O
[1] 10

R igraph: label vertex by condition

I have a graph g with a set of vertices and a list with the names of some "special" vertices.
I want the graph to display a label with the vertex name ONLY for these special vertices.
I tried something like this:
plot(g, vertex.size = 4, vertex.label = ifelse(V(g) %in% usernames, V(g)$label, ""), asp = F)
But apparently I'm missing something, because the predicate never evaluates to TRUE.
Also, it seems like V(g)$label results in the following error:
Error in ans[test & ok] <- rep(yes, length.out = length(ans))[test & ok] : replacement has length zero
Your help will be appreciated! :)
Turn V(g) into V(g)$label?
set.seed(1)
library(igraph)
g <- ba.game(26)
V(g)$label <- letters[1:26]
usernames <- sample(letters, 5)
plot(g, vertex.label = ifelse(V(g)$label %in% usernames, V(g)$label, NA))
or
set.seed(1)
library(igraph)
g <- ba.game(26)
usernames <- sample(vcount(g), 5)
plot(g, vertex.label = ifelse(V(g) %in% usernames, V(g), NA))
?
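A minimal sketch of the same idea for a graph whose vertex names live in the name attribute (the ring graph and the names are made up for illustration). If a graph has no label attribute, V(g)$label is NULL, which is a likely cause of the "replacement has length zero" error from ifelse().

```r
library(igraph)

g <- make_ring(5)
V(g)$name <- c("ann", "bob", "cat", "dan", "eve")
usernames <- c("bob", "eve")   # the "special" vertices

# No label attribute has been set, so this is NULL
is.null(V(g)$label)

# Compare names with names; NA suppresses the label
lab <- ifelse(V(g)$name %in% usernames, V(g)$name, NA)
lab

plot(g, vertex.size = 4, vertex.label = lab)
```

Building the label vector first also makes it easy to inspect before plotting.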

How to color branches in R dendrogram as a function of the classes in it?

I wish to visualize how well a clustering algorithm is doing (with certain distance metric). I have samples and their corresponding classes.
To visualize, I cluster and I wish to color the branches of a dendrogram by the items in the cluster. The color will be the color most items in the hierarchical cluster correspond to (given by the data\classes).
Example: If my clustering algorithm chose indexes 1,21,24 to be a certain cluster (at a certain level) and I have a csv file containing a class number in each row, corresponding to, let's say, 1,2,1, I want this edge to be coloured 1.
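To make the rule concrete, here is a small base R sketch of the majority vote for a single cluster. The function name majority_class and the classes vector are made up for illustration; in the real data the classes would come from the csv file.

```r
# Hypothetical helper: most frequent class among the cluster members
majority_class <- function(members, classes) {
  counts <- table(classes[members])
  # which.max() returns the first maximum, so ties go to the
  # class that sorts first
  names(counts)[which.max(counts)]
}

# Classes for 24 samples; only the three of interest are filled in
classes <- rep(NA, 24)
classes[c(1, 21, 24)] <- c(1, 2, 1)

majority_class(c(1, 21, 24), classes)
```

The result is returned as a character string, so it can be used directly as a lookup key into a named vector of colors.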
Example Code:
require(cluster)
suppressPackageStartupMessages(library(dendextend))
dir <- 'distance_metrics/'
filename <- 'aligned.csv'
my.data <- read.csv(paste(dir, filename, sep=""), header = T, row.names = 1)
my.dist <- as.dist(my.data)
real.clusters <-read.csv("clusters", header = T, row.names = 1)
clustered <- diana(my.dist)
# dend <- colour_branches(???dend, max(real.clusters)???)
plot(dend)
EDIT:
another example partial code
dir <- 'distance_metrics/' # csv in here contains a symmetric matrix
clust.dir <- "clusters/" #csv in here contains a column vector with classes
my.data <- read.csv(paste(dir, filename, sep=""), header = T, row.names = 1)
filename <- 'table.csv'
my.dist <- as.dist(my.data)
real.clusters <-read.csv(paste(clust.dir, filename, sep=""), header = T, row.names = 1)
clustered <- diana(my.dist)
dnd <- as.dendrogram(clustered)
Both node and edge color attributes can be set recursively on "dendrogram" objects (which are just deeply nested lists) using dendrapply. The cluster package also features an as.dendrogram method for "diana" class objects, so conversion between the object types is seamless. Using your diana clustering and borrowing some code from @Edvardoss's iris example, you can create the colored dendrogram as follows:
library(cluster)
set.seed(999)
iris2 <- iris[sample(x = 1:150,size = 50,replace = F),]
clust <- diana(iris2)
dnd <- as.dendrogram(clust)
## Duplicate rownames aren't allowed, so we need to set the "labels"
## attributes recursively. We also label inner nodes here.
rectify_labels <- function(node, df){
  newlab <- df$Species[unlist(node, use.names = FALSE)]
  attr(node, "label") <- (newlab)
  return(node)
}
dnd <- dendrapply(dnd, rectify_labels, df = iris2)
## Create a color palette as a data.frame with one row for each spp
uniqspp <- as.character(unique(iris$Species))
colormap <- data.frame(Species = uniqspp, color = rainbow(n = length(uniqspp)))
colormap[, 2] <- c("red", "blue", "green")
colormap
## Now color the inner dendrogram edges
color_dendro <- function(node, colormap){
  if(is.leaf(node)){
    nodecol <- colormap$color[match(attr(node, "label"), colormap$Species)]
    attr(node, "nodePar") <- list(pch = NA, lab.col = nodecol)
    attr(node, "edgePar") <- list(col = nodecol)
  }else{
    spp <- attr(node, "label")
    dominantspp <- levels(spp)[which.max(tabulate(spp))]
    edgecol <- colormap$color[match(dominantspp, colormap$Species)]
    attr(node, "edgePar") <- list(col = edgecol)
  }
  return(node)
}
dnd <- dendrapply(dnd, color_dendro, colormap = colormap)
## Plot the dendrogram
plot(dnd)
The function you are looking for is color_branches from the dendextend R package, using the arguments clusters and col. Here is an example (based on Shaun Wilkinson's example):
library(cluster)
set.seed(999)
iris2 <- iris[sample(x = 1:150,size = 50,replace = F),]
clust <- diana(iris2)
dend <- as.dendrogram(clust)
temp_col <- c("red", "blue", "green")[as.numeric(iris2$Species)]
temp_col <- temp_col[order.dendrogram(dend)]
temp_col <- factor(temp_col, unique(temp_col))
library(dendextend)
dend %>% color_branches(clusters = as.numeric(temp_col), col = levels(temp_col)) %>%
set("labels_colors", as.character(temp_col)) %>%
plot
I suspect I may have misunderstood the question, but I'll try to answer.
I adapted code from an earlier task of mine to the iris example:
clrs <- rainbow(n = 3) # create palette
clrs <- clrs[iris$Species] # assign colors
plot(x = iris$Sepal.Length,y = iris$Sepal.Width,col=clrs) # simple test colors
# cluster
dt <- cbind(iris,clrs)
dt <- dt[sample(x = 1:150,size = 50,replace = F),] # create short dataset for visualization convenience
empty.labl <- gsub("."," ",dt$Species) # create a space vector with length of names intended for reserve place to future text labels
dst <- dist(x = scale(dt[,1:4]),method = "manhattan")
hcl <- hclust(d = dst,method = "complete")
plot(hcl,hang=-1,cex=1,labels = empty.labl, xlab = NA,sub=NA)
dt <- dt[hcl$order,] # sort rows for order objects in dendrogramm
text(x = seq(nrow(dt)), y=-.5,labels = dt$Species,srt=90,cex=.8,xpd=NA,adj=c(1,0.7),col=as.character(dt$clrs))

Subset graph based on edges weight

I have a graph, G=(V,E) with several attributes including an edge weight attribute. I'm trying to create a subgraph based on a condition where weights are higher than x.
I've tried the standard R subsetting option with g <- E(g)[weight > max(weight)*.10], but I always get a vector.
I'm not sure what I'm doing wrong here.
Maybe you want something like this
library(igraph)
set.seed(1)
m <- matrix(sample(c(.5, 2, 5), 100, replace = TRUE, prob = c(.6, .3, .1)), ncol = 10, dimnames = rep(list(letters[1:10]), 2))
g <- graph_from_adjacency_matrix(m, weighted = TRUE, diag = FALSE, mode = "undirected")
coords <- layout.auto(g)
par(mfrow = c(1,3))
plot(g, layout=coords, edge.width = E(g)$weight)
s1 <- subgraph.edges(g, E(g)[E(g)$weight > 2], delete.vertices = FALSE)
plot(s1, layout=coords, edge.width = E(s1)$weight)
s2 <- delete_vertices(s1, degree(s1, mode = "in")==0)
plot(s2, layout=coords[V(g)$name%in%V(s2)$name,], edge.width = E(s2)$weight)
That would be because you replaced your graph g with just subsetted edges. If you want to remove edges below the threshold weight, you can use:
g_sub <- delete_edges(g, E(g)[weight <= max(weight)*.10])
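A quick way to see the difference, sketched on a made-up five-edge ring (the weights are arbitrary): indexing E(g) yields an edge sequence, whereas deleting the light edges yields a graph that keeps all vertices.

```r
library(igraph)

g <- make_ring(5)
E(g)$weight <- c(1, 20, 30, 2, 50)

# Indexing E(g) selects edges; the result is an edge sequence, not a graph
heavy <- E(g)[weight > max(weight) * 0.10]
class(heavy)

# Removing the light edges instead returns a graph
g_sub <- delete_edges(g, E(g)[weight <= max(weight) * 0.10])
is_igraph(g_sub)
ecount(g_sub)
vcount(g_sub)
```

Storing the result under a new name such as g_sub also keeps the original graph available for comparison.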

Resources