Phylogenetic tree tip color - r

I have a phylogenetic tree that I drew in R. I want to color the tip edges according to the taxonomic order of each species. How can I choose the color of each tip individually?
I first tried:
EdgeCols <- rep("black", Nedge(tree))
EdgeCols[which.edge(tree, tree$edge[1]) ] <- "red"
plot( tree, space = 30, assoc = AMat,
show.tip.label = T, gap = 1, length.line = 0, edge.color =EdgeCols1)
But the color of this edge does not change.
Can anyone tell me where the problem is?

I am not exactly sure what you are trying to do, but here is how to color specific edges of a phylogeny with the ape package. Here is code for coloring all edges:
library(ape)
# Simulate tree
ntax <- 20
tree <- rcoal(ntax)
# Color branches
colors <- rainbow(Nedge(tree))
plot(tree, edge.color=colors)
And for coloring all terminal branches:
# Color terminal branches
colors2 <- rep("black", Nedge(tree))
colors2[which(tree$edge[,2] %in% 1:ntax)] <- rainbow(ntax)
plot(tree, edge.color=colors2)
I would also point out that there are some obvious issues in your code:
You have tree$edge[1], but tree$edge is a two-column matrix, so indexing it with a single value returns only its first element (a node number), not an edge.
The which.edge function takes a vector of tips and returns the indices of the edges within the clade defined by those tips. It looks like you are passing it a single value, which is not a meaningful group.
You define EdgeCols, but your plot call uses EdgeCols1.
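To color the terminal branches by group (for example by taxonomic order) rather than one color per tip, a minimal sketch could look like the following. The grouping vector tip_orders and the color lookup order_cols are made up for illustration; in practice they would come from your own data.

```r
library(ape)

set.seed(1)
tree <- rcoal(6)

# Hypothetical assignment of each tip to an order (names = tip labels)
tip_orders <- setNames(rep(c("OrderA", "OrderB"), length.out = Ntip(tree)),
                       tree$tip.label)
# Hypothetical color per order
order_cols <- c(OrderA = "red", OrderB = "blue")

# Terminal edges are the rows of tree$edge whose second column is a tip number
edge_cols <- rep("black", Nedge(tree))
terminal <- which(tree$edge[, 2] <= Ntip(tree))
tip_of_edge <- tree$tip.label[tree$edge[terminal, 2]]
edge_cols[terminal] <- unname(order_cols[tip_orders[tip_of_edge]])

# Tip labels get the same colors as their terminal branches
tip_cols <- unname(order_cols[tip_orders[tree$tip.label]])
plot(tree, edge.color = edge_cols, tip.color = tip_cols)
```

Looking up the colors through the tip labels, instead of relying on positions, keeps the colors correct even when the tips are not in the same order as your data table.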

Related

Defining Node shape in a Network plot with an additional attribute table in R

I am working on plotting a Network and it contains two different types of Nodes which I want to visualise with different shapes. For that I made an additional table in which I specified which structure is which type using a binary system. Now I want to specify in my plot function that the structures with 1 are to be triangles and the ones with 0 as circles.
My data for the Network is in the format of an adjacency matrix (I use igraph) and I am using ggnet2 for the plotting of it.
This is how I imported the data:
am <- as.matrix(read.csv2("mydata.csv", header = T, row.names = 1))
g <- graph_from_adjacency_matrix(am, mode = "undirected")
attr <- read.csv2("myattributes.csv", header = T, row.names = 1)
This is how I would plot it, but I don't know how to specify the shape argument:
ggnet2(g, size = "degree", node.color = "darkgreen", shape = ??????)
Thanks in advance for your help!
Note that the package-requirements for plotting igraphs with ggnet2 include ggplot2, sna and network as well as intergraph as a bridge.
ggnet2 is prettier, sure, but the igraph-way is this:
g <- erdos.renyi.game(100,100,'gnm')
V(g)$shape <- sample(c('csquare','circle'), 100, replace=T)
plot(g, vertex.label = NA)
Note that I added two igraph-style shapes as vertex attributes to g above. In ggnet2 you can provide a vector of shapes; the entries can be arbitrary values (even a factor) or numbers (the usual gray circle is 19). Try this to plot in ggnet2:
ggnet2(g, shape=19)
ggnet2(g, shape=10+round(1:100/10))
ggnet2(g, shape=factor(V(g)$shape))
V(g)$shape <- sample(c('One shape','Another shape'), 100, replace=T)
ggnet2(g, shape=V(g)$shape, size = "degree", node.color = "darkgreen")
Note that if you add attributes to your vertices after loading the attribute data separately (as you do above), the order of the rows matters. Make sure the table import works as intended, with each attribute value assigned to the correct vertex. I find it good practice to store all values as attributes on the igraph object (edge and vertex attributes alike) rather than letting the network data live in separate data frames or loose vectors that have to be combined to visualise the network correctly.
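As a sketch of attaching the attribute table by vertex name rather than by row position, assuming the adjacency matrix has row and column names and the attribute table uses the same names as row names (the small matrix and the column name type are invented here):

```r
library(igraph)

# Toy adjacency matrix with named rows and columns
am <- matrix(c(0, 1, 1,
               1, 0, 0,
               1, 0, 0),
             nrow = 3, byrow = TRUE,
             dimnames = list(c("a", "b", "c"), c("a", "b", "c")))
g <- graph_from_adjacency_matrix(am, mode = "undirected")

# Hypothetical attribute table, deliberately in a different row order
attr_tab <- data.frame(type = c(1, 0, 1), row.names = c("c", "a", "b"))

# Look the values up by vertex name, so row order does not matter
V(g)$type <- attr_tab[V(g)$name, "type"]

# igraph shape names; "square" stands in for the triangle here
V(g)$shape <- ifelse(V(g)$type == 1, "square", "circle")
plot(g)

# Numeric point codes that could be passed as the shape vector to ggnet2:
# 17 is a filled triangle and 19 a filled circle
pch_shape <- ifelse(V(g)$type == 1, 17, 19)
```

The square is used because, as far as I know, igraph's built-in vertex shapes do not include a triangle; the numeric codes in pch_shape are the usual R point symbols.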

Phylogenetic tree ape too small

I am building a phylogenetic tree using data from NCBI taxonomy. The tree is quite simple; its aim is to show the relationships among a few arthropods.
The problem is that the tree looks very small and I can't seem to make its branches longer. I would also like to color some nodes (e.g. Pancrustaceans), but I don't know how to do this using ape.
Thanks for any help!
library(treeio)
library(ape)
treeText <- readLines('phyliptree.phy')
treeText <- paste0(treeText, collapse="")
tree <- read.tree(text = treeText) ## load tree
distMat <- cophenetic(tree) ## generate dist matrix
plot(tree, use.edge.length = TRUE,show.node.label = T, edge.width = 2, label.offset = 0.75, type = "cladogram", cex = 1, lwd=2)
Here are some pointers using the ape package. I am using a random tree as we don't have access to yours, but these examples should be easily adaptable to your problem. If you provide a reproducible example of a specific question, I could take another look.
First we make a random tree, add some species names, and plot it to show the numbers of the nodes (both terminal and internal) and edges:
library(ape)
set.seed(123)
Tree <- rtree(10)
Tree$tip.label <- paste("Species", 1:10, sep="_")
plot.phylo(Tree)
nodelabels() # blue
tiplabels() # yellow
edgelabels() # green
Then, to color any node or edge of the tree, we can create a vector of colors and provide it to the appropriate *labels() function.
# using numbers for colors
node_colors <- rep(5:6, each=5)[1:9] # 9 internal nodes
edge_colors <- rep(3:4, each=9) # 18 branches
tip_colors <- rep(c(11,12,13), 4)[1:10] # 10 tips
# plot:
plot.phylo(Tree, edge.color = edge_colors, tip.color = tip_colors)
nodelabels(pch = 21, bg = node_colors, cex=2)
To label just one node and the clade descending from it, we could do:
Nnode(Tree)
node_colors <- rep(NA, 9)
node_colors[7] <- "green"
node_shape <- ifelse(is.na(node_colors), NA, 21)
edge_colors <- rep("black", 18)
edge_colors[13:18] <- "green"
plot(Tree, edge.color = edge_colors, edge.width = 2, label.offset = .1)
nodelabels(pch=node_shape, bg=node_colors, cex=2)
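The edge indices 13:18 above are specific to this particular random tree. A sketch of finding the edges of a clade programmatically, assuming the clade is identified by its internal node number (tips are numbered 1 to Ntip, internal nodes from Ntip + 1 upwards):

```r
library(ape)

set.seed(123)
Tree <- rtree(10)

# The 7th internal node, i.e. node number Ntip + 7
node <- Ntip(Tree) + 7

# Tips descending from that node, then the edges connecting them
clade_tips <- extract.clade(Tree, node)$tip.label
clade_edges <- which.edge(Tree, clade_tips)

edge_colors <- rep("black", Nedge(Tree))
edge_colors[clade_edges] <- "green"

plot(Tree, edge.color = edge_colors, edge.width = 2, label.offset = .1)
# Mark only the chosen node
nodelabels(node = node, pch = 21, bg = "green", cex = 2)
```

This avoids reading the edge numbers off the plot by hand, and it keeps working if the tree or the chosen node changes.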
Without your tree, it is harder to tell how to adjust the branches. One way is to reduce the size of the tip labels (the cex argument), so they take up less space. Another way might be to play around with the width and height of the device when saving the png or pdf.
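A rough sketch of both suggestions, using a random tree and a temporary file; the device size and cex value are arbitrary starting points, not recommendations:

```r
library(ape)

set.seed(123)
Tree <- rtree(10)
Tree$tip.label <- paste("Species", 1:10, sep = "_")

# Write to a larger device so the branches have more room
out_file <- file.path(tempdir(), "tree.pdf")
pdf(out_file, width = 10, height = 8)
# Smaller tip labels and no margins leave more space for the tree itself
plot(Tree, cex = 0.7, no.margin = TRUE, edge.width = 2)
dev.off()
```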
There are other ways of doing these embellishments of trees, including the ggtree package.

Color bar legend for values on vertices of igraph in R

I am new to R and I am starting to work on graph visualization using igraph. The example below creates a simple network of 10 vertices and colors them according to color values (which in this case, for simplicity, I set to be the same as the ids of the vertices).
library(igraph)
vertices <- 1:10
first <- 1:10
second <- c(2:10,1)
edges = cbind(first,second)
color = 1:10
net = graph_from_data_frame(edges,vertices=vertices ,directed=F )
V(net)$color = color
plot(net)
However, from this plot it is not clear which colors correspond to which numbers.
To deal with this, I have tried the various legends I was able to find in the documentation and online. Take for instance the code below:
legend("bottom", legend=levels(as.factor(color)), bty = "n", cex = 1.5, pt.cex = 3, pch=20, col = color, horiz = FALSE, inset = c(0.1, -0.3))
But in this case the result is messy, obscures the picture, and does not provide a continuous color bar that maps the range of values on the nodes to a color spectrum. The other options I was able to find are no better.
Do you know how to make a legend in the form of a continuous color bar placed below or to the right of the picture (so that it does not cover any part of it)? Ideally the color bar should show the whole continuous spectrum of colors and a few values corresponding to the colors (at least the extreme ones).
Do you happen to know how to achieve this?
Thank you for your help!
You should check out this answer by kokkenbaker; although it is a bit cumbersome, it might be just what you need.
How to add colorbar with perspective plot in R
Thanks to ealbsho93 I was able to produce the following solution. It creates a palette, then maps the values on the vertices of the graph to the palette and displays it. It is not straightforward, but the result looks much better (see below).
rm(list=ls())
library(igraph)
library(fields)
vertices <- 1:10
first <- 1:10
second <- c(2:10,1)
edges = cbind(first,second)
net = graph_from_data_frame(edges,vertices=vertices ,directed=F )
# Here we create sample values on the vertices of the graph
color_num = 10:1
# Create a color palette of the same size as the number of distinct values.
jet.colors <- colorRampPalette( rainbow( length( unique(color_num) ) ) )
color_spectrum <- jet.colors( length( unique(color_num ) ) )
# Map the palette to the order of the values on the vertices
ordered <- order(color_num)
color <- vector(length = length(ordered), mode="character")
for ( i in seq_along(ordered) )
{
color[ ordered[i] ] <- color_spectrum [ i ]
}
V(net)$color = color
#Display the graph and the legend.
plot(net)
image.plot(legend.only=T, zlim=range(color_num), col=color_spectrum )
If there is a better solution, please let me know. Otherwise, this one seems to be OK to use.
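The loop above assumes every vertex has a distinct value. A sketch of a vectorised alternative that also handles repeated or non-integer values, by binning the values into a fixed number of palette steps (the values, the number of steps and the palette colors are invented for illustration):

```r
library(igraph)

# Hypothetical vertex values, including a tie
values <- c(3.2, 1.0, 7.5, 7.5, 4.1)

# A palette with a fixed number of steps
n_col <- 100
pal <- colorRampPalette(c("blue", "yellow", "red"))(n_col)

# Bin each value into one of n_col equal-width intervals over the value range
breaks <- seq(min(values), max(values), length.out = n_col + 1)
idx <- cut(values, breaks = breaks, include.lowest = TRUE, labels = FALSE)

g <- make_ring(length(values))
V(g)$color <- pal[idx]
plot(g)
```

Equal values get the same color, and the palette pal together with range(values) can then be handed to a color bar function such as the image.plot call used above.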

R Indexing a matrix to use in plot coordinates

I'm trying to plot a temporal social network in R. My approach is to create a master graph and layout for all nodes. Then, I will subset the graph based on a series of vertex ids. However, when I do this and lay out the graph, I get completely different node locations. I think I'm subsetting the layout matrix incorrectly, but I can't locate where my issue is because I've done some smaller matrix subsets and everything seems to work fine.
I have some example code and an image of the issue in the network plots.
library(igraph)
# make graph
g <- barabasi.game(25)
# make layout and set some aesthetics
set.seed(123)
l <- layout_nicely(g)
V(g)$size <- rescale(degree(g), c(5, 20))
V(g)$shape <- 'none'
V(g)$label.cex <- .75
V(g)$label.color <- 'black'
E(g)$arrow.size = .1
# plot graph
dev.off()
par(mfrow = c(1,2),
mar = c(1,1,5,1))
plot(g, layout = l,
main = 'Entire\ngraph')
# use index & induced subgraph
v_ids <- sample(1:25, 15, F)
sub_l <- l[v_ids, c(1,2)]
sub_g <- induced_subgraph(g, v_ids)
# plot second graph
plot(sub_g, layout = sub_l,
main = 'Sub\ngraph')
The vertices in the second plot should match layout of those in the first.
Unfortunately, you set the random seed after you generated the graph,
so we cannot exactly reproduce your result. I will use the same code but
with set.seed before the graph generation. This makes the result look
different than yours, but will be reproducible.
When I run your code, I do not see exactly the same problem as you are
showing.
Your code (with set.seed moved and scales added)
library(igraph)
library(scales) # for rescale function
# make graph
set.seed(123)
g <- barabasi.game(25)
# make layout and set some aesthetics
l <- layout_nicely(g)
V(g)$size <- rescale(degree(g), c(5, 20))
V(g)$shape <- 'none'
V(g)$label.cex <- .75
V(g)$label.color <- 'black'
E(g)$arrow.size = .1
## V(g)$names = 1:25
# plot graph
dev.off()
par(mfrow = c(1,2),
mar = c(1,1,5,1))
plot(g, layout = l,
main = 'Entire\ngraph')
# use index & induced subgraph
v_ids <- sort(sample(1:25, 15, F))
sub_l <- l[v_ids, c(1,2)]
sub_g <- induced_subgraph(g, v_ids)
# plot second graph
plot(sub_g, layout = sub_l,
main = 'Sub\ngraph', vertex.label=V(sub_g)$names)
When I run your code, both graphs have nodes in the same
positions. That is not what I see in the graph in your question.
I suggest that you run just this code and see if you don't get
the same result (nodes in the same positions in both graphs).
The only difference between the two graphs in my version is the
node labels. When you take the subgraph, it renumbers the nodes
from 1 to 15 so the labels on the nodes disagree. You can fix
this by storing the node labels in the graph before taking the
subgraph. Specifically, add V(g)$names = 1:25 immediately after
your statement E(g)$arrow.size = .1. Then run the whole thing
again, starting at set.seed(123). This will preserve the
original numbering as the node labels.
The graph looks slightly different because the new, sub-graph
does not take up all of the space and so is stretched to use
up the empty space.
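One way to avoid the stretching is to normalise the layout once and turn off rescaling in both plots. This is only a sketch: the vertex attribute orig_id is a name chosen here for illustration, and sample_pa is used as the current name of barabasi.game.

```r
library(igraph)

set.seed(123)
g <- sample_pa(25)

# Remember the original vertex numbers; they survive induced_subgraph()
V(g)$orig_id <- seq_len(vcount(g))

# Normalise the layout to [-1, 1] once, for the full graph
l <- norm_coords(layout_nicely(g), xmin = -1, xmax = 1, ymin = -1, ymax = 1)

v_ids <- sort(sample(vcount(g), 15))
sub_g <- induced_subgraph(g, v_ids)
# Pick the layout rows through the stored original ids
sub_l <- l[V(sub_g)$orig_id, , drop = FALSE]

par(mfrow = c(1, 2), mar = c(1, 1, 5, 1))
# rescale = FALSE keeps the coordinates as given in both plots
plot(g, layout = l, rescale = FALSE,
     vertex.label = V(g)$orig_id, main = "Entire\ngraph")
plot(sub_g, layout = sub_l, rescale = FALSE,
     vertex.label = V(sub_g)$orig_id, main = "Sub\ngraph")
```

With rescaling off, each remaining vertex is drawn at exactly the same coordinates in both panels.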
A possible quick workaround: draw the same graph, but give the nodes and edges that you don't need the color of your background. Depending on your purposes, this may suit you.

Adding a legend in visNetwork for edge color

I am using visNetwork to visualize a graph, but I need to introduce a legend based on the edge color. The edge color depends on an edge attribute and is dynamic in nature. I tried to do this with visGroups/visLegend, but got multiple errors.
Please find below a reproducible example.
library(igraph)
library(visNetwork)
gg <- graph.atlas(711)
V(gg)$name=1:7
gg=set_edge_attr(gg,"Department",E(gg)[1:10],c("A","B","C","A","E","C","G","B","C","A"))
E(gg)$label=E(gg)$Department
F2 <- colorRampPalette(c("red", "blue","orange","violet","cyan"), bias = length(unique(E(gg)$Department)), space = "rgb", interpolate = "linear")
colCodes <- F2(length(unique(E(gg)$Department)))
edges_col <- sapply(E(gg)$Department,function(x) colCodes[which(sort(unique(E(gg)$Department)) == x)])
E(gg)$color <-edges_col
datatest = toVisNetworkData(gg)
visNetwork(datatest$nodes,datatest$edges) %>% visIgraphLayout(layout="layout_in_circle")
I need a legend such as Red - A, Blue - C, etc.
Kindly help.
visLegend is primarily based on node groups, but you can also set the node and/or edge legend manually (see ?visLegend).
For your example:
# data.frame for the edges legend
ledges <- data.frame(color = unique(edges_col),
label = unique(names(edges_col)))
visNetwork(datatest$nodes,datatest$edges) %>% visIgraphLayout(layout="layout_in_circle") %>%
visLegend(useGroups = F, addEdges = ledges)
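The legend data frame can also be built from an explicit lookup of department to color, which avoids relying on the names that sapply attaches to edges_col. A sketch using base R only, with the department vector copied from the question and the lookup name colCodes reused as a named vector:

```r
# Edge attribute values, as in the question
department <- c("A", "B", "C", "A", "E", "C", "G", "B", "C", "A")

# One color per distinct department, stored as a named lookup vector
dep_levels <- sort(unique(department))
pal_fun <- colorRampPalette(c("red", "blue", "orange", "violet", "cyan"))
colCodes <- setNames(pal_fun(length(dep_levels)), dep_levels)

# Color of each edge, looked up by department name
edge_color <- unname(colCodes[department])

# One legend row per department
ledges <- data.frame(color = unname(colCodes),
                     label = names(colCodes),
                     stringsAsFactors = FALSE)
ledges
```

The resulting ledges has the same color and label columns as the data frame passed to visLegend(addEdges = ...) above.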
