I'm trying to plot a circular dendrogram of compositional data. I'm using the following code:
library(dendextend)
library(circlize)
library(compositions)
data("Hydrochem")
hydro<-Hydrochem
d <- dist(hydro[7:19], method="euclidean")
hc <- hclust(d, method = "average")
dend <- as.dendrogram(hc)
hydro$River <- as.character(hydro$River)
labels(dend) <- hydro$River[order.dendrogram(dend)]
plot(dend)
I can get a normal dendrogram of what I want with the correct label orders.
But when I run circlize_dendrogram(dend), I get this:
What's vexing me is the dendrogram in the middle - when I don't use the order of the dendrogram for the labels (i.e. just typing labels(dend) <- hydro$River), the inner dendrogram is fine and everything looks great.
I've tried altering the labels_track_height and dend_track_height settings to no avail, and when I run the same process on smaller toy datasets this issue doesn't arise.
Any ideas?
So you actually have two problems surfacing in your code:
1. The labels are not unique.
2. The plot does not leave enough room for the labels after you've updated them in the dendrogram object.
The first problem can be solved by adding numbers to the non-unique labels you supply, thus making them unique. The second can be solved by adjusting the labels_track_height argument of the circlize_dendrogram function. Here is the updated code (the differences are in the two relabelling lines and in the last line):
library(dendextend)
library(circlize)
library(compositions)
data("Hydrochem")
hydro<-Hydrochem
d <- dist(hydro[7:19], method="euclidean")
hc <- hclust(d, method = "average")
dend <- as.dendrogram(hc)
tmp <- as.character(hydro$River)[order.dendrogram(dend)]
labels(dend) <- paste0(seq_along(tmp), "_", tmp)
plot(dend)
circlize_dendrogram(dend, labels_track_height = 0.4)
The output you get is this:
(This is now done automatically in dendextend 1.6.0, currently available on GitHub and later also on CRAN.)
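For the first problem, it can help to check for duplicates before assigning labels. The following is a minimal base R sketch; the `river` vector is a made-up stand-in for `hydro$River`, and `make.unique()` is shown only as an alternative to the index prefix used above, not as what dendextend does internally.

```r
# Hypothetical labels with repeats, standing in for hydro$River
river <- c("Anoia", "Anoia", "Cardener", "Anoia", "Cardener")

# TRUE if at least one label occurs more than once
has_duplicates <- anyDuplicated(river) > 0

# Option 1: prefix every label with a running index (as in the answer above)
labels_indexed <- paste0(seq_along(river), "_", river)

# Option 2: base R make.unique() adds a numeric suffix to repeated labels only
labels_suffixed <- make.unique(river)
```

Either vector can then be assigned with `labels(dend) <- ...`, provided it is in the same order as `order.dendrogram(dend)`.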
So, the solution to this problem is to add a second dend <- as.dendrogram(hc) call after defining the labels (if anyone can provide more details please do, because I don't really understand why this matters at all). The code then looks like this:
d <- dist(hydro[7:19], method="euclidean")
hc <- hclust(d, method = "average")
dend <- as.dendrogram(hc)
hydro$River <- as.character(hydro$River)
labels(dend) <- hydro$River[order.dendrogram(dend)]
dend <- as.dendrogram(hc)
circlize_dendrogram(dend)
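NOTE by another user: this does not solve the question. The second as.dendrogram(hc) call rebuilds dend from hc and discards the relabelled object, so the plot no longer uses the River labels at all.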
I am not able to add colored rectangles around the chosen clusters.
library(lattice)
library(permute)
library(vegan)
library("ggplot2")
library("ggdendro")
library("dendextend")
data(dune)
d <- vegdist(dune)
csin <- hclust(d, method = "aver")
ggdendrogram(csin)
rect.dendrogram(csin, 3, border = 1:3)
I get this answer:
"Error in rect.dendrogram(csin, 3, border = 1:3) :
x is not a dendrogram object."
Although csin is the dendrogram object. Does anyone have a clue?
As I wrote in the comments:
csin is hclust and not a dendrogram (use as.dendrogram to make it into a dendrogram)
rect.dendrogram works with base R plots, not ggplot2.
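A quick class check makes the first point concrete. This is a small sketch in base R only, using iris as example data; it also shows `rect.hclust()` from the stats package, which draws rectangles on a base plot of an hclust object without any conversion.

```r
d <- dist(iris[, -5])
hc <- hclust(d, method = "average")

# hclust() returns an "hclust" object, which is not a dendrogram
class_hc <- class(hc)
is_dend_before <- inherits(hc, "dendrogram")

# as.dendrogram() converts it to the class rect.dendrogram expects
dend <- as.dendrogram(hc)
is_dend_after <- inherits(dend, "dendrogram")

# Base R alternative that works directly on the hclust object
plot(hc, labels = FALSE)
rect.hclust(hc, k = 3, border = 1:3)
```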
Here is a simple example of making your rect.dendrogram work:
library("dendextend")
d <- dist(iris[,-5])
csin <- as.dendrogram(hclust(d, method = "aver"))
plot(csin)
rect.dendrogram(csin, 3, border = 1:3)
The output:
I know this is a little bit too much, but I am plotting a dendrogram in R, and here is my code:
dd <- dist(scale(full[,c(1,2,3,4)]),method="euclidean")
hc = hclust(dd,method="ward.D2")
dend <- color_branches(as.dendrogram(hc),6)
labels_colors(dend) <-
rainbow_hcl(6)[sort_levels_values(
as.numeric(classified[, 9])[order.dendrogram(dend)]
)]
plot(dend, horiz = TRUE)
and I got this plot:
Is there any way to mirror the plot so that it looks like this? (Please ignore the difference in colour.)
plot_horiz.dendrogram(dend, side = TRUE)
should do the trick. See https://rdrr.io/cran/dendextend/f/vignettes/FAQ.Rmd
In the following example:
hc <- hclust(dist(mtcars))
hcd <- as.dendrogram((hc))
hcut4 <- cutree(hc,h=200)
class(hcut4)
plot(hcd,ylim=c(190,450))
I'd like to add the labels of the classes.
I can do:
hcd4 <- cut(hcd,h=200)$upper
plot(hcd4)
Besides the fact labels are oddly shifted, does the numbering
of the branches from cut() always correspond to the classes in hcut4?
In this case, they do:
hcd4cut <- cutree(hcd4, h=200)
hcd4cut
But is this the general case?
The example using dendextend (Label and color leaf dendrogram in r) is nice:
library(dendextend)
colorCodes <- c("red","green","blue","cyan")
labels_colors(hcd) <- colorCodes[hcut4][order.dendrogram(hcd)]
plot(hcd)
Unfortunately, I always have many individuals, so plotting individuals is rarely a useful option for me.
I can do:
hcd <- as.dendrogram((hc))
hcd4 <- cut(hcd,h=200)$upper
and I can add colors
hcd4cut <- cutree(hcd4, h=200)
labels_colors(hcd4) <- colorCodes[hcd4cut][order.dendrogram(hcd4)]
plot(hcd4)
but the following does not work:
plot(hcd4,labels=hcd4cut)
Is there a better way to plot the cut dendrogram labelling branches
according to the classes (consistent with the result of cutree())?
This is an example of what I would need (class labels edited on the picture),
but note that the problem is that I do not know if the labels are actually at the right branch:
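One way to avoid relying on the numbering at all is to look up, for each branch below the cut, which cutree() class its leaves belong to. The sketch below uses base R only and assumes that branch i in cut(hcd, h = 200)$lower corresponds to the i-th leaf of $upper; the names `branch_classes` and `branch_to_class` are made up for illustration.

```r
hc <- hclust(dist(mtcars))
hcd <- as.dendrogram(hc)
hcut4 <- cutree(hc, h = 200)

# Subtrees below the cut, in left-to-right plotting order
lower <- cut(hcd, h = 200)$lower

# For each branch, collect the cutree() classes of its leaves
branch_classes <- lapply(lower, function(b) unique(unname(hcut4[labels(b)])))

# Class id for each branch, in branch order
branch_to_class <- unlist(branch_classes)
branch_to_class
```

If every branch maps to exactly one class, `branch_to_class` gives the class labels in the order of the leaves of the cut tree, whether or not the branch numbers happen to coincide with the class numbers.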
Is there any R function to retrieve the branch lengths of a dendrogram:
set.seed(1)
mat <- matrix(rnorm(100*10),nrow=100,ncol=10)
dend <- as.dendrogram(hclust(dist(t(mat))))
in a breadth-first-search order?
For dend I'd like to get this result:
c(16.38688,15.41441,15.99504,14.68365,13.52949,14.39275,12.96921,13.91157,13.15395)
which are the node heights (excluding leaves) in BFS order.
Thanks
You can easily code one like this:
dendro_depth <- function(dendro) {
  # Returns the number of levels below a node (0 for a leaf), not the merge heights
  if (!is.null(attributes(dendro)$leaf)) {
    0
  } else {
    max(dendro_depth(dendro[[1]]), dendro_depth(dendro[[2]])) + 1
  }
}
See get_branches_heights from dendextend.
set.seed(1)
mat <- matrix(rnorm(100*10),nrow=100,ncol=10)
dend <- as.dendrogram(hclust(dist(t(mat))))
library(dendextend)
get_branches_heights(dend, sort = FALSE)
It does not seem to be exactly in the order you want, but see if this is still useful:
> get_branches_heights(dend, sort = FALSE)
[1] 16.38688 15.41441 14.68365 15.99504 13.52949
[6] 12.96921 14.39275 13.91157 13.15395
BTW, the recent GitHub version of dendextend also comes with the highlight_branches function for coloring branches based on branch height (in case this is somehow related to your motivation):
plot(highlight_branches(dend))
The data:
set.seed(1)
mat <- matrix(rnorm(100*10),nrow=100,ncol=10)
dend <- as.dendrogram(hclust(dist(t(mat))))
The data.tree package allows traversing trees in various orders. The "level" traversal gives the order the question asks for:
library(data.tree)
dend.dt <- as.Node(dend)
sapply(Traverse(dend.dt, traversal = "level", pruneFun = isNotLeaf), function(x) x$plotHeight)
[1] 16.38688 15.41441 15.99504 14.68365 13.52949 14.39275 12.96921 13.91157 13.15395
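The same level-order traversal can also be sketched in base R with an explicit queue, without extra packages. The function name `bfs_heights` is made up for this example; it visits nodes breadth-first and records the "height" attribute of every non-leaf node.

```r
set.seed(1)
mat <- matrix(rnorm(100 * 10), nrow = 100, ncol = 10)
dend <- as.dendrogram(hclust(dist(t(mat))))

# Hypothetical helper: heights of internal nodes in breadth-first order
bfs_heights <- function(dend) {
  queue <- list(dend)          # nodes waiting to be visited
  heights <- numeric(0)
  while (length(queue) > 0) {
    node <- queue[[1]]         # take the node at the front of the queue
    queue <- queue[-1]
    if (!is.leaf(node)) {
      heights <- c(heights, attr(node, "height"))
      # enqueue the children left to right
      for (i in seq_along(node)) {
        queue <- c(queue, list(node[[i]]))
      }
    }
  }
  heights
}

h <- bfs_heights(dend)
h
```

Growing the queue with `c()` is quadratic in the number of nodes, which should be fine for trees of this size but may be slow for very large dendrograms.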
I have clustered some data in R and plotted the results as a dendrogram. What I am trying to find out now is how to change the colour of the labels so that identical labels have the same colour.
I got my dendrogram using the following code:
d <- stringdist::stringdistmatrix(AR_GenesforR$AR_Genes)
cl <- hclust(as.dist(d))
plot(cl, labels = AR_GenesforR$AR_Genes)
groups <- cutree(cl, k = 2)
rect.hclust(cl, k = 2, border = "red")
The resulting dendrogram looks like this:
What I want to do now is to colour all labels that are the same in the same colour, e.g. all 2010 in yellow, all 2011 in blue and so on. I have researched quite a bit, but mostly only found ways to colour the labels according to the clusters they are in. Does someone know how I can do what I want?
Here is a function that will do what you ask, based on the dendextend R package (here is a short 2 page paper on the package).
x <- c(2011,2011,2012,2012,2015,2015,2015)
names(x) <- x
dend <- as.dendrogram(hclust(dist(x)))
color_unique_labels <- function(dend, ...) {
  if (!require(dendextend)) install.packages("dendextend")
  if (!require(colorspace)) install.packages("colorspace")
  library("dendextend")
  # One color per distinct label; identical labels share a color
  n_unique_labels <- length(unique(labels(dend)))
  colors <- colorspace::rainbow_hcl(n_unique_labels)
  labels_number <- as.numeric(factor(labels(dend)))
  labels_colors(dend) <- colors[labels_number]
  dend
}
par(mfrow = c(1,2))
plot(dend)
dend2 <- color_unique_labels(dend)
plot(dend2)
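The core of the function above is a lookup from label to colour. As a rough sketch of just that step in base R, the following uses a named vector; `rainbow()` from grDevices is swapped in for `colorspace::rainbow_hcl()` to avoid extra packages, and the `labs` vector is a made-up stand-in for `labels(dend)`.

```r
# Hypothetical labels in dendrogram order, standing in for labels(dend)
labs <- c("2011", "2011", "2012", "2012", "2015", "2015", "2015")

# One colour per distinct label, stored as a named lookup vector
lookup <- setNames(rainbow(length(unique(labs))), unique(labs))

# Colour for each label, in the same order as the labels
label_cols <- unname(lookup[labs])
label_cols
```

A named lookup also makes it easy to pick specific colours by hand, for example by replacing `lookup["2011"]` with `"blue"` before building `label_cols`.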