I am plotting a dendrogram of the moduleeigengenes in the WGCNA package and I want to order/swap the branches. I use the plotEigengeneNetworks function to plot it, but cannot define the order of the branches. I know that there is the dendextend package for modifying dendrograms, but this does not work on the output that plotEigengeneNetworks function produces. I would be helpful for any suggestions on how to achieve this.
Example:
library(WGCNA)
set.seed(123)
ME <- data.frame(replicate(15, sample(1:10, 11, rep=TRUE)))
ME[,c(1:11)] <- sapply(ME[, c(1:11)], as.numeric)
plotEigengeneNetworks(ME, plotAdjacency = TRUE, setLabels = colnames(ME), plotDendrograms = TRUE, plotHeatmaps = FALSE)
Looking at the code of plotEigengeneNetworks, You will not be able to do what you want using it. However, what you can do is reproduce the way it creates the cluster, and then use the dendextend package to directly update the dendrogram (produced from the hclust), by using the following:
# we need to run all of this to get the relevant packages...
source("http://bioconductor.org/biocLite.R")
biocLite("S4Vectors")
biocLite("IRanges")
biocLite("GenomeInfoDb")
biocLite("AnnotationDbi")
biocLite("GO.db")
biocLite("WGCNA")
# what the author wanted:
library(WGCNA)
set.seed(123)
ME <- data.frame(replicate(15, sample(1:10, 11, rep=TRUE)))
ME[,c(1:11)] <- sapply(ME[, c(1:11)], as.numeric)
plotEigengeneNetworks(ME, plotAdjacency = TRUE, setLabels = colnames(ME), plotDendrograms = TRUE, plotHeatmaps = FALSE)
# =================================
# Reproduce the above plot:
corME = cor(ME)
disME = as.dist(1 - corME)
clust = fastcluster::hclust(disME, method = "average") # you could also use stats::hclust just as well...
plot(clust)
# Now that we got what we wanted, let's move to dendrogram land
dend <- as.dendrogram(clust)
# get dendextend
if(!require(dendextend)) install.packages("dendextend")
library(dendextend)
dend <- hang.dendrogram(dend)
# plot(dend) # it now looks similar to the hclust plot
# we can now rotate the labels:
dend <- color_labels(dend)
dend2 <- rotate(dend, order = sort(labels(dend)))
par(mfrow = c(1,2))
plot(dend, main = "Original dend plot")
plot(dend2, main = "Dend plot after rotating the labels")
#
Result:
Related
I was using Seurat to analyse single cell RNA-seq data and I managed to draw a heatmap plot with DoHeatmap() after clustering and marker selection, but got a bunch of random characters appearing in the legend. They are random characters as they will change every time you run the code. I was worrying over it's something related to my own dataset, so I then tried the test Seurat object 'ifnb' but still got the same issue (see the red oval in the example plot).
example plot
I also tried importing the Seurat object in R in the terminal (via readRDS) and ran the plotting function, but got the same issue there, so it's not a Rstudio thing.
Here are the codes I ran:
'''
library(Seurat)
library(SeuratData)
library(patchwork)
InstallData("ifnb")
LoadData("ifnb")
ifnb.list <- SplitObject(ifnb, split.by = "stim")
ifnb.list <- lapply(X = ifnb.list, FUN = function(x) {
x <- NormalizeData(x)
x <- FindVariableFeatures(x, selection.method = "vst", nfeatures = 2000)
})
features <- SelectIntegrationFeatures(object.list = ifnb.list)
immune.anchors <- FindIntegrationAnchors(object.list = ifnb.list, anchor.features = features)
immune.combined <- IntegrateData(anchorset = immune.anchors)
immune.combined <- ScaleData(immune.combined, verbose = FALSE)
immune.combined <- RunPCA(immune.combined, npcs = 30, verbose = FALSE)
immune.combined <- RunUMAP(immune.combined, reduction = "pca", dims = 1:30)
immune.combined <- FindNeighbors(immune.combined, reduction = "pca", dims = 1:30)
immune.combined <- FindClusters(immune.combined, resolution = 0.5)
DefaultAssay(immune.combined) <- 'RNA'
immune_markers <- FindAllMarkers(immune.combined, latent.vars = "stim", test.use = "MAST", assay = 'RNA')
immune_markers %>%
group_by(cluster) %>%
top_n(n = 10, wt = avg_log2FC) -> top10_immune
DoHeatmap(immune.combined, slot = 'data',features = top10_immune$gene, group.by = 'stim', assay = 'RNA')
'''
Does anyone have any idea how to solve this issue other than reinstalling everything?
I have been having the same issue myself and while I have solved it by not needing the legend, I think you could use this approach and use a similar solution:
DoHeatmap(immune.combined, slot = 'data',features = top10_immune$gene, group.by = 'stim', assay = 'RNA') +
scale_color_manual(
values = my_colors,
limits = c('CTRL', 'STIM'))
Let me know if this works! It doesn't solve the source of the odd text values but it does the job! If you haven't already, I would recommend creating a forum question on the Seurat forums to see where these characters are coming from!
When I use seurat4.0, I met the same problem.
While I loaded 4.1, it disappeared
I am trying to apply Hierarchical Clustering for Time Series in order to identify the states with similar behaviors in the time series for residential_percent_change_from_baseline. I get the dendrogram but the index i get in the x axis are just numbers and I want the states names.
my data looks like this:
Data
And this is some part of my code
data <- dataset
#Convert to factor
cols <- c("country_region_code", "country_region", "sub_region_1", "iso_3166_2_code")
data[cols] <- lapply(data[cols], factor)
sapply(data, class)
data$date <- as.Date(data$date)
summary(data)
#Data preparation
n <- 10
s <- sample(1:100, n)
i <- c(s,0+s, 279+s, 556+s, 833+s, 1110+s, 1387+s, 1664+s, 1941+s, 2218+s, 2495+s, 2772+s, 3049+s, 3326+s, 3603+s, 3880+s, 4157+s, 4434+s, 4711+s, 4988+s, 5265+s, 5542+s, 5819+s, 6096+s, 6373+s, 6650+s, 6927+s, 7204+s, 7481+s, 7758+s, 8035+s, 8312+s, 8589+s, 8866+s)
d <- data[i,3:4]
d$residential <- data[i,11]
d[,2] =NULL
str(d)
pattern <- c(rep('Mexico', n),
rep('Aguascalientes', n),
rep('Baja California',n),
rep('Baja California Sur',n),
rep('Campeche',n),
rep('Coahuila',n),
rep('Colima',n),
rep('Chiapas',n),
rep('Chihuahua',n),
rep('Durango',n),
rep('Guanajuato',n),
rep('Guerrero',n),
rep('Hidalgo',n),
rep('Jalisco',n),
rep('México City',n),
rep('Michoacan',n),
rep('Morelos',n),
rep('Nayarit',n),
rep('Nuevo León',n),
rep('Oaxaca',n),
rep('Puebla',n),
rep('Querétaro',n),
rep('Quintana Roo',n),
rep('San Luis Potosí',n),
rep('Sinaloa',n),
rep('Sonora',n),
rep('Tabasco',n),
rep('Tamaulipas',n),
rep('Tlaxcala',n),
rep('Veracruz',n),
rep('Yucatán',n),
rep('Zacatecas.',n))
d <- data.matrix(d)
distance <- dist(d, method = 'euclidean')
hc <- hclust(distance, method="ward.D")
plot(hc, cex=.7, hang = -1, col='blue', labels=pattern)
I get this dendrogram when I don't specify labels
dendrogram with numeric labels
But when I do I get this error
Error in graphics:::plotHclust(n1, merge, height, order(x$order), hang, : invalid dendrogram input
I hope somebody can help me, I am little bit tired of this
Maybe it will work with an alternative to the base r plot function. Try ggdendroplot. It should display the labels on the axis. You will need ggplot2 for this.
devtools::install("nicolash2/ggdendroplot")
library(ggdendroplot)
library(ggplot2)
ggplot() + geom_dendro(hc)
If you want to modify it (turn it, color it, etc.) check out the github page: https://github.com/NicolasH2/ggdendroplot
I'm trying to use gridGraphics::grid.echo according to this link so I can combine a Gviz plotTracks plot with a ggplot using cowplot's plot_grid.
Following Gviz's vignette and this link, here's what I do:
require(Gviz)
data(geneModels)
gtrack <- GenomeAxisTrack()
itrack <- IdeogramTrack(genome = "hg19", chromosome = as.character(geneModels$chromosome[1]))
grtrack <- GeneRegionTrack(geneModels, genome = "hg19",chromosome = as.character(geneModels$chromosome[1]), name = "Gene Model")
require(gridGraphics)
gwrap_plot <- function(x) {
tree <- grid::grid.grabExpr(gridGraphics::grid.echo(x))
u <- grid::unit(1, 'null')
gtable::gtable_col(NULL, list(tree), u, u)
}
graphics.off()
plotTracks(list(itrack, gtrack, grtrack))
track.plot <- recordPlot()
gwrap_plot(track.plot)
And I get this error:
Error in switch(x[[2]][[1]]$name, C_abline = C_abline(x[[2]]), C_plot_new = C_plot_new(x[[2]]), :
EXPR must be a length 1 vector
Any idea what's the problem of gridGraphics's grid.echo with Gviz's plotTracks plot?
grid.echo is for base graphics, Gviz appears to use grid graphics,
p1 = grid::grid.grabExpr(plotTracks(list(itrack, gtrack, grtrack), add = TRUE))
p2 = ggplot2::qplot(1:10, 1:10)
gridExtra::grid.arrange(p1, p2, ncol=2)
I am using biwavelet package to conduct wavelet coherence analysis. When I want to set my own x ticklabel, I find axisis not working. The following gives a reproducible example. Thanks.
require(biwavelet)
t1 <- cbind(1:100, rnorm(100))
t2 <- cbind(1:100, rnorm(100))
wtc.t1t2 <- wtc(t1,t2,nrands = 10)
plot(wtc.t1t2, plot.cb = TRUE, plot.phase = TRUE,xaxt='n')
axis(1,at = seq(10,100,10),labels = seq(1,10,1))
The thing that was breaking your plot was plot.cb = TRUE.
In the source code for plot.biwavelet the author notes the following about the plot.cb option:
## Add color bar: this must happen after everything, otherwise chaos
ensues!
So that was the problem -- you invoked axis() after plot.cb and chaos ensued. However, you can manually add back the color bar using image.plot from the fields package, after having run plot without plot.cb then having added your axis().
pacman::p_load(biwavelet,fields)
t1 <- cbind(1:100, rnorm(100))
t2 <- cbind(1:100, rnorm(100))
wtc.t1t2 <- wtc(t1,t2,nrands = 10)
plot(wtc.t1t2, plot.phase = TRUE,xaxt='n')
axis(1,at = seq(10,100,10),labels = seq(1,20,2))
image.plot( zlim=c(0,25), legend.only=TRUE)
You can customize the ticks and the color bar to your liking this way!
I want to create a heatmap using the heatmap.2 function from the gplots package. This is a minimal example.
require(gplots)
# create symmetric matrix
x = matrix(rnorm(100), nrow=10)
diag(x) <- 1
x[upper.tri(x)] <- t(x)[upper.tri(x)]
colnames(x) <- rownames(x) <- letters[1:nrow(x)]
# create side colours
varcols = setNames(rainbow(nrow(x)), rownames(x))
# create heatmap
heatmap.2(x,
symm = TRUE,
trace = "none",
revC=TRUE, # <-- THIS IS THE PROBLEM
ColSideColors = varcols,
RowSideColors = varcols
)
The problem are the sidecolors. x is a symmetric matrix, thus columns and rows should have the same sidecolors. This is fine as long as revC = FALSE. However, when I use revC = TRUE the order of the colors is messed up. Sometimes - in small examples - it helps to reverse the ColSideColors, but that doesn't always work.
Am I doing anything wrong or is this a gplots bug?
For anyone else who comes across this problem this is how I solved it:
thing = heatmap.2(my_matrix,...RowSideColors=row_cols, revC=F)
ordinary_order = thing$rowInd
reversal = cbind(ordinary_order, rev(ordinary_order))
rev_col = row_cols[reversal[,2]]; rev_col = rev_col[order(reversal[,1])];
heatmap.2(my_matrix, RowSideColors=rev_col, revC=T)