R: Centering heatmap.2 key (gplots package)

I would like to create a heatmap via the heatmap.2() command with a color key that is centered on 0 (i.e. white -> 0, red -> greater than 0, blue -> less than 0) while keeping scale="none", as I am interested in plotting a heatmap of the actual values. However, none of my heatmaps are centered on zero when I use the following call:
library(gplots)
outputHeatmap <- heatmap.2(heatmapInputActual, dendrogram="none", Rowv=FALSE,
Colv=FALSE, col= bluered(256), scale="none", key=TRUE, density.info="none",
trace="none", cexRow=0.125, cexCol=0.125, symm=FALSE, symkey=TRUE)
I thought that setting symkey=TRUE would work, but it does not. The variable I am trying to make a heatmap of is an (n x 3) matrix of numerical values. A problematic input to the heatmap.2() call described above follows:
8.408458 5.661144 0.00000000
4.620846 4.932283 -0.46570468
-4.638912 -3.471838 -0.12146109
-4.822829 -3.946024 0.06403327
3.948832 4.520447 -0.31945941
Thank you for your time. I look forward to your replies.

The solution seems to be simply adding symbreaks=TRUE to your heatmap.2 call, which makes the color breaks symmetric about 0. Here is a fully reproducible example with your data:
library(gplots)
#read your example data
heatmapInputActual <- read.table(textConnection(
"8.408458 5.661144 0.00000000
4.620846 4.932283 -0.46570468
-4.638912 -3.471838 -0.12146109
-4.822829 -3.946024 0.06403327
3.948832 4.520447 -0.31945941
"),as.is=TRUE)
#convert sample data to matrix
heatmapInputActual <- as.matrix(heatmapInputActual)
#just add symbreaks to the end of your code
heatmap.2(heatmapInputActual, dendrogram="none", Rowv=FALSE, Colv=FALSE,
col = bluered(256), scale="none", key=TRUE, density.info="none",
trace="none", cexRow=0.125, cexCol=0.125, symm=FALSE, symkey=TRUE, symbreaks=TRUE)
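To see what symmetric breaks amount to, they can also be built by hand and passed through the breaks argument. The following is a minimal sketch, not the gplots internals: it assumes limits of plus and minus max(abs(x)) and 256 colors, and the names x, lim, brks and pal are made up for illustration. The gplots call is guarded so the rest runs in base R.

```r
# Example data from the question
x <- matrix(c( 8.408458,  5.661144,  0.00000000,
               4.620846,  4.932283, -0.46570468,
              -4.638912, -3.471838, -0.12146109,
              -4.822829, -3.946024,  0.06403327,
               3.948832,  4.520447, -0.31945941),
            ncol = 3, byrow = TRUE)

# Symmetric limits: the largest absolute value sets both ends of the scale
lim <- max(abs(x))

# 257 evenly spaced breaks give 256 color bins, with the middle break at 0
brks <- seq(-lim, lim, length.out = 257)

# Base-R blue-white-red palette (a stand-in for gplots::bluered)
pal <- colorRampPalette(c("blue", "white", "red"))(256)

# Plot only if gplots is installed
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(x, dendrogram = "none", Rowv = FALSE, Colv = FALSE,
                    col = pal, breaks = brks, scale = "none", key = TRUE,
                    density.info = "none", trace = "none")
}
```

Explicit breaks like these are also handy when several heatmaps need to share one color scale.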

Related

(R) - Colorkey of Heatmap.2 is taking too long to load in PDF

I am trying to create a heatmap for a data set that contains 4 columns and 600 rows. I am using the heatmap.2 function and creating the file using the pdf() function. If I create the heatmap without the Color Key it opens quickly; however, if I add the Color Key it takes a while to fully load it and I don't know why. It doesn't necessarily impact the final image, but if I try to load it into Adobe Illustrator it crashes.
The file with and without the color key is 436 KB and 70 KB, respectively. As I am using color breaks I decided to lower the number to see if it would help, but no success. I also tried to lower the size of the color key using the keysize argument, but also no success.
My code is the following:
data <- read.csv("C:/Users/proteomics.csv")
rnames <- data[,1]
mat_data <- data.matrix(data[,2:ncol(data)])
row.names(mat_data) <- rnames
my_palette <- colorRampPalette(c("blue", "white", "red"))(n = 299)
col_breaks = c(seq(min(mat_data),-0.833,length=100), # for Blue
seq(-0.832,0.794,length=100), # for White
seq(0.795,max(mat_data),length=100)) # for Red
pdf("C:/Users/file.pdf",
paper = "letter", height = 10, useDingbats=FALSE)
heatmap.2(mat_data,
#cellnote = mat_data, # same data set for cell labels
main = "Proteomics", # heat map title
notecol="black", # change font color of cell labels to black
density.info="none", # turns off density plot inside color legend
trace="none", # turns off trace lines inside the heat map
margins =c(4,3), # widens margins around plot
col=my_palette, # use the color palette defined earlier
breaks=col_breaks, # enable color transition at specified limits
dendrogram="none", # draw no dendrograms
#scale = "row",
labRow = FALSE,
Rowv = FALSE, # turn off row clustering
srtCol=45, # Column text angle
key=T, cexRow = 0.2, cexCol=1,
lhei=c(3,22), lwid=c(3,8),
keysize=0.25,
#key.par = list(cex=0.1),
Colv=FALSE) # turn off column clustering
dev.off()
It feels like the color key is adding way too much information to the final file. Is there any way to reduce it? In theory the PDF should then open more quickly and load into Illustrator.
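One thing that may be worth checking; this is an assumption, not a confirmed diagnosis. The three seq() calls above leave two bins that are only 0.001 wide (between -0.833 and -0.832, and between 0.794 and 0.795), so the breaks are very unevenly spaced. If the key is rendered at a resolution tied to the narrowest bin, that could inflate the PDF. The sketch below builds breaks whose segments share endpoints instead. Here mat_data is a random stand-in for the real data, and whether this actually shrinks the file is untested.

```r
# Hypothetical stand-in for the 600 x 4 proteomics matrix
set.seed(1)
mat_data <- matrix(rnorm(600 * 4, sd = 2), ncol = 4)

# Adjacent segments share their endpoint; [-1] drops the duplicated value
col_breaks <- c(seq(min(mat_data), -0.833, length.out = 101),     # blue
                seq(-0.833, 0.794, length.out = 101)[-1],         # white
                seq(0.794, max(mat_data), length.out = 100)[-1])  # red

# One color per bin, i.e. one fewer than the number of breaks
my_palette <- colorRampPalette(c("blue", "white", "red"))(length(col_breaks) - 1)

# Ratio of widest to narrowest bin: small values mean roughly even spacing
bin_ratio <- max(diff(col_breaks)) / min(diff(col_breaks))
print(bin_ratio)
```

The resulting col_breaks and my_palette can be passed to heatmap.2 through breaks and col exactly as before.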

R pheatmap scale different than scale before pheatmap

The heatmap when scaling before plotting:
mat_scaled <- scale(t(mat))
pheatmap(t(mat_scaled), show_rownames=F, show_colnames=F,
border_color=F, color=colorRampPalette(brewer.pal(6,name="PuOr"))(12))
with the scale going from [-2, 6] is completely different than when using the scaling within the pheatmap function
pheatmap(t(mat_scaled), scale="row", show_rownames=F,
show_colnames=F, border_color=F, color=colorRampPalette(brewer.pal(6,name="PuOr"))(12))
where the scale is set from [-6,6].
Why is there this difference, and how can I obtain the matrix represented in the second figure?
In the second figure you plot the heatmap of mat_scaled, which is already scaled, and pheatmap scales it a second time because of the option scale="row".
This is not the right way to compare external and internal scaling.
Here is the solution, which passes the unscaled matrix when scale="row" is used and fixes the same breaks for both plots:
library(gridExtra)
library(pheatmap)
library(RColorBrewer)
cols <- colorRampPalette(brewer.pal(6,name="PuOr"))(12)
brks <- seq(-3,3,length.out=12)
data(attitude)
mat <- as.matrix(attitude)
# Scale by row
mat_scaled <- t(scale(t(mat)))
p1 <- pheatmap(mat_scaled, show_rownames=F, show_colnames=F,
breaks=brks, border_color=F, color=cols)
p2 <- pheatmap(mat, scale="row", show_rownames=F, show_colnames=F,
breaks=brks, border_color=F, color=cols)
grid.arrange(grobs=list(p1$gtable, p2$gtable))
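As a sanity check on the external scaling step, t(scale(t(mat))) can be compared against row z-scores computed by hand. This is a small base-R sketch using the same attitude data; the names scaled_outside, row_z and max_diff are illustrative.

```r
data(attitude)
mat <- as.matrix(attitude)

# External row scaling: transpose, scale the columns, transpose back
scaled_outside <- t(scale(t(mat)))

# Row z-scores by hand: subtract each row's mean, divide by each row's sd
row_z <- (mat - rowMeans(mat)) / apply(mat, 1, sd)

# Largest absolute difference between the two results
max_diff <- max(abs(scaled_outside - row_z))
print(max_diff)
```

A difference at the level of floating-point noise indicates that both routes produce the same row-scaled matrix.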

Heatmap for a large matrix and get the clear labels

How can I draw a heatmap from a matrix of 305 columns and 865 rows in R?
The code I have written for the matrix is
library(RColorBrewer) # provides brewer.pal()
nba <- read.csv("mydata.csv", sep=",")
row.names(nba) <- nba[,1]
nba <- nba[,-1] # drop the first column, now used as row names
nba_matrix <- data.matrix(nba)
nba_heatmap <- heatmap(nba_matrix, Rowv=NA, Colv=NA, col = brewer.pal(9, "Blues"), scale="column", margins=c(5,10))
Now the code gives me the heatmap shown below, but the labels are not clear. Please help me get a clear heatmap.
Since you stated that you need all the labels, the only way I see is to reduce the font size. You can do this by setting the cexCol and cexRow parameters in your call to heatmap(), for example:
heatmap(as.matrix(iris[,1:3]), cexRow = 0.1, cexCol = 0.1)
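If every label has to stay readable, another option besides shrinking cex is to draw on a much larger page so that each label gets some room. A minimal sketch, assuming about 0.12 inch per row and per column (an arbitrary figure) and using a random matrix named big as a stand-in for the real data:

```r
# Hypothetical stand-in for the 865 x 305 matrix
set.seed(1)
big <- matrix(rnorm(865 * 305), nrow = 865,
              dimnames = list(paste0("row", 1:865), paste0("col", 1:305)))

# Write to a temporary file; replace with a real path as needed
out_file <- file.path(tempdir(), "big_heatmap.pdf")

# Page size grows with the number of rows and columns, plus room for margins
pdf(out_file, width = ncol(big) * 0.12 + 4, height = nrow(big) * 0.12 + 4)
heatmap(big, Rowv = NA, Colv = NA, scale = "column",
        cexRow = 0.6, cexCol = 0.6, margins = c(5, 10))
dev.off()
```

The page is large, but a PDF viewer can zoom in until individual labels are legible.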

R draw heatmap with clusters, but hide dendrogram

By default, R's heatmap will cluster rows and columns:
mtscaled = as.matrix(scale(mtcars))
heatmap(mtscaled, scale='none')
I can disable the clustering:
heatmap(mtscaled, Colv=NA, Rowv=NA, scale='none')
And then the dendrogram goes away, but now the data is not clustered anymore.
I don't want the dendrograms to be shown, but I still want the rows and/or columns to be clustered. How can I do this?
You can do this with pheatmap:
mtscaled <- as.matrix(scale(mtcars))
pheatmap::pheatmap(mtscaled, treeheight_row = 0, treeheight_col = 0)
library(gplots)
heatmap.2(mtscaled,dendrogram='none', Rowv=TRUE, Colv=TRUE,trace='none')
Rowv - is TRUE, which means the row dendrogram is computed and the rows are reordered based on row means.
Colv - is handled the same way as Rowv, applied to the columns.
I had a similar issue with pheatmap, which gives better visualisation than heatmap or heatmap.2. Although heatmap.2 is one option, here is a solution with pheatmap that extracts the order of the clustered data.
library(pheatmap)
mtscaled = as.matrix(scale(mtcars))
H = pheatmap(mtscaled)
That draws the clustered heatmap with its dendrograms. Replot it with the rows and columns in the extracted cluster order and with clustering switched off, so that no dendrogram is drawn:
pheatmap(mtscaled[H$tree_row$order, H$tree_col$order], cluster_rows = FALSE, cluster_cols = FALSE)
For ComplexHeatmap, there are function parameters to remove the dendrograms:
library(ComplexHeatmap)
Heatmap(as.matrix(iris[,1:4]), name = "mat", show_column_dend = FALSE, show_row_dend = FALSE)
You can rely on base R structures and consider the following approach, which builds the hclust trees yourself.
mtscaled = as.matrix(scale(mtcars))
row_order = hclust(dist(mtscaled))$order
column_order = hclust(dist(t(mtscaled)))$order
heatmap(mtscaled[row_order,column_order], Colv=NA, Rowv=NA, scale="none")
No need to install additional junk.
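Note that the order taken from hclust()$order is not necessarily the one heatmap() shows by default, because heatmap() additionally reorders each dendrogram by the row or column means. If matching that default layout matters, the following base-R sketch reproduces the reordering step; the variable names are illustrative.

```r
mtscaled <- as.matrix(scale(mtcars))

# Rows: cluster, then reorder the dendrogram using the row means as weights
row_dend <- reorder(as.dendrogram(hclust(dist(mtscaled))), rowMeans(mtscaled))
row_order <- order.dendrogram(row_dend)

# Columns: same steps on the transposed matrix, weighted by column means
col_dend <- reorder(as.dendrogram(hclust(dist(t(mtscaled)))), colMeans(mtscaled))
col_order <- order.dendrogram(col_dend)

# Draw the reordered matrix with clustering and dendrograms switched off
heatmap(mtscaled[row_order, col_order], Rowv = NA, Colv = NA, scale = "none")
```

Reordering only flips branches of the tree, so the clusters themselves are unchanged; only their left-to-right and top-to-bottom arrangement differs.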
Run the basic R heatmap function twice. Take the output of the first run, which clusters but always draws the dendrogram, and feed its ordering into the heatmap function again, this time without clustering and without drawing the dendrogram.
#generate a random symmetrical matrix with a little bit of structure, and make a heatmap
M100s<-matrix(runif(10000),nrow=100)
M100s[2,]<-runif(100,min=0.1,max=0.2)
M100s[4,]<-runif(100,min=0.1,max=0.2)
M100s[6,]<-runif(100,min=0.1,max=0.2)
M100s[99,]<-runif(100,min=0.1,max=0.2)
M100s[37,]<-runif(100,min=0.1,max=0.2)
M100s[lower.tri(M100s)] <- t(M100s)[lower.tri(M100s)]
heatmap(M100s)
#save the output
OutputH <- heatmap(M100s)
#run it again without clustering or the dendrogram
M100c <- M100s[OutputH$rowInd, OutputH$colInd] # rows by rowInd, columns by colInd
heatmap(M100c, Rowv = NA, Colv = NA, labRow = NA, labCol = NA)

Trying to determine why my heatmap made using heatmap.2 and using breaks in R is not symmetrical

I am trying to cluster a protein dna interaction dataset, and draw a heatmap using heatmap.2 from the R package gplots. My matrix is symmetrical.
Here is a copy of the data set I am using after it is run through Pearson correlation: DataSet
Here is the complete process that I am following to generate these graphs: generate a distance matrix using some correlation measure (Pearson in my case), then pass that matrix to R and run the following code on it:
library(RColorBrewer);
library(gplots);
library(MASS);
args <- commandArgs(TRUE);
matrix_a <- read.table(args[1], sep='\t', header=T, row.names=1);
mtscaled <- as.matrix(scale(matrix_a))
# location <- args[2];
# setwd(args[2]);
pdf("result.pdf", pointsize = 15, width = 18, height = 18)
mycol <- c("blue","white","red")
my.breaks <- c(seq(-5, -.6, length.out=6),seq(-.5999999, .1, length.out=4),seq(.100009,5, length.out=7))
#colors <- colorpanel(75,"midnightblue","mediumseagreen","yellow")
result <- heatmap.2(mtscaled, Rowv=T, scale='none', dendrogram="row", symm = T, col=bluered(16), breaks=my.breaks)
dev.off()
The issue I am having is once I use breaks to help me control the color separation the heatmap no longer looks symmetrical.
Here is the heatmap before I use breaks, as you can see the heatmap looks symmetrical:
Here is the heatmap when breaks are used:
I have played with the cutoffs for the sequences, for instance to make sure one sequence does not end exactly where the next begins, but I am not able to solve this problem. I would like to use the breaks to help bring out the clusters more.
Here is an example of what it should look like, this image was made using cluster maker:
I don't expect it to look identical to that, but I would like it if my heatmap is more symmetrical and I had better definition in terms of the clusters. The image was created using the same data.
After some investigating, I noticed that the values were changing after running my matrix through heatmap or heatmap.2. For example, in the provided data set the interaction between Pacdh-2 and pegg-2 had a value of 0.0250313 before the matrix was sent to heatmap. When I then looked at the matrix values using result$carpet, the values for the two interactions were -0.224333135 and -1.09805379.
So I decided to reorder the original matrix based on the dendrogram from the clustered matrix, so that I could be sure the values would stay the same. I used the following Stack Overflow question for help:
Order of rows in heatmap?
Here is the code used for that:
rowInd <- rev(order.dendrogram(result$rowDendrogram))
colInd <- rowInd
data_ordered <- matrix_a[rowInd, colInd]
I then used another program "matrix2png" to draw the heatmap:
I still have to play around with the colors but at least now the heatmap is symmetrical and clustered.
Looking into it further, the issue seems to be that I was running scale(matrix_a), which scales each column separately. When I change my code to just mtscaled <- as.matrix(matrix_a), the result looks symmetrical.
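The effect of scale() on symmetry is easy to reproduce in base R. The sketch below uses a small random symmetric matrix as a hypothetical stand-in for the correlation matrix; the names sym, scaled, asym_before and asym_after are illustrative.

```r
# Build a small symmetric matrix (stand-in for the real correlation matrix)
set.seed(42)
m <- matrix(runif(25), nrow = 5)
sym <- (m + t(m)) / 2

# scale() centers and rescales each column on its own
scaled <- scale(sym)

# Largest mismatch between entry [i, j] and entry [j, i], before and after
asym_before <- max(abs(sym - t(sym)))
asym_after <- max(abs(scaled - t(scaled)))
print(c(before = asym_before, after = asym_after))
```

Because each column gets its own mean and standard deviation, entry [i, j] and entry [j, i] are transformed differently, so the scaled matrix is generally no longer symmetric.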
I'm certainly not the person to attempt reproducing and testing this from that strange data object without code that would read it properly, but here's an idea:
..., col=bluered(20)[4:20], ...
Here's another thought, which should return the full range of red, which the above strategy would not:
shift.BR<- colorRamp(c("blue","white", "red"), bias=0.5 )((1:16)/16)
heatmap.2( ...., col=rgb(shift.BR, maxColorValue=255), .... )
Or you can use this vector:
> rgb(shift.BR, maxColorValue=255)
[1] "#1616FF" "#2D2DFF" "#4343FF" "#5A5AFF" "#7070FF" "#8787FF" "#9D9DFF" "#B4B4FF" "#CACAFF" "#E1E1FF" "#F7F7FF"
[12] "#FFD9D9" "#FFA3A3" "#FF6C6C" "#FF3636" "#FF0000"
There was a somewhat similar question (also today) asking for a blue-to-red solution for a set of values from -1 to 3 with white at the center. This is the code for that question:
test <- seq(-1,3, len=20)
shift.BR <- colorRamp(c("blue","white", "red"), bias=2)((1:20)/20)
tpal <- rgb(shift.BR, maxColorValue=255)
barplot(test,col = tpal)
(But that would seem to be the wrong direction for the bias in your situation.)
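An alternative to tuning bias, offered as a sketch rather than as part of the answer above, is to map each value onto the ramp through a range that is symmetric about zero, so white lands on 0 regardless of how lopsided the data are. The function name zero_centered_colors is made up for illustration.

```r
# Map values to blue-white-red so that 0 is always white.
# Assumes at least one non-zero value; NA values are not handled.
zero_centered_colors <- function(x) {
  lim <- max(abs(x))
  stopifnot(lim > 0)
  ramp <- colorRamp(c("blue", "white", "red"))
  # Rescale [-lim, lim] to [0, 1]; 0 maps to 0.5, the white midpoint
  rgb(ramp((x + lim) / (2 * lim)), maxColorValue = 255)
}

test <- seq(-1, 3, length.out = 20)
tpal <- zero_centered_colors(test)
barplot(test, col = tpal)
```

With values from -1 to 3, only part of the blue side of the ramp is used, while the red side runs to full saturation at 3.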
