How to draw a heatmap from a matrix of 305 columns and 865 rows in R.
The code I have written for the matrix is
nba <- read.csv("mydata.csv", sep=",")
row.names(nba) <- nba[,1]
nba <- nba[,2:865]
nba_matrix <- data.matrix(nba)
nba_heatmap <- heatmap(nba_matrix, Rowv=NA, Colv=NA, col = brewer.pal(9, "Blues"), scale="column", margins=c(5,10))
The code produces the heatmap shown below, but the labels are not legible. Please help me get a clear heatmap.
Since you stated that you need all the labels, the only way I see is to reduce the font size. You can do this by setting the cexCol and cexRow parameters in your call to heatmap(), for example like this:
heatmap(as.matrix(iris[,1:3]), cexRow = 0.1, cexCol = 0.1)
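If shrinking the font alone makes the labels unreadable, another option is to draw into a much larger device so that each row gets enough vertical space. The following is only a sketch with made-up data of the same shape as in the question; the file name, the device size and the cex values are assumptions you would need to tune.

```r
# Made-up matrix with the same shape as in the question (865 rows, 305 columns)
set.seed(1)
m <- matrix(rnorm(865 * 305), nrow = 865,
            dimnames = list(paste0("row", 1:865), paste0("col", 1:305)))

# Draw into a large PDF page so every label gets some room.
# The width, height and cex values are guesses, not recommendations.
out_file <- file.path(tempdir(), "heatmap_large.pdf")
pdf(out_file, width = 40, height = 100)
heatmap(m, Rowv = NA, Colv = NA, scale = "column",
        cexRow = 0.5, cexCol = 0.5, margins = c(5, 10))
dev.off()
```

You can then zoom into the PDF to read individual labels.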
I am using heatmaply to create a heatmap but unfortunately my row labels are not showing as names. Row labels are row numbers. How can I make my row names from column 1 show on the heatmap ?
Here is my code. The row names are still not showing as labels, but rather as numbers (1, 2, 3, ...):
heatmaply(mtcars, k_col =10, k_row =1, row.names(mtcars) <- mtcars[,1], cexRow = 0.1, cexCol=10, margins =c(100,100))
Please advise
Thanks
I don't know what your heatmap should look like (I could not generate a map with the code you provided). However, I was able to generate a graph with the row labels (Mazda RX4, etc.) by simply removing row.names from your code:
heatmaply(mtcars, k_col =10, k_row =1, cexRow = 0.1, cexCol=10, margins =c(100,100))
Well the secret is using labRow= cars[,1] in the code as follows:
heatmaply(cars, k_col=14, k_row=1,labRow= cars[,1], cexRow=10, cexCol=10, margins=c(50,50), scale_fill_gradient_fun=ggplot2::scale_fill_gradient2(low="navy blue", high ="red",midpoint=1, limits=c(0,2.2)))
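If the names live in the first column of the data frame, an alternative is to move them into the row names before plotting, so that the plotting function picks them up by itself. This is a minimal base R sketch; `cars_df` is a hypothetical data frame built from mtcars just for illustration.

```r
# Hypothetical input: names in column 1, numeric values in the other columns
cars_df <- data.frame(model = rownames(mtcars)[1:5],
                      mtcars[1:5, c("mpg", "hp", "wt")])
rownames(cars_df) <- NULL          # mimic a freshly read CSV (row names 1, 2, 3, ...)

# Use column 1 as row names, then drop it so only numeric columns remain
rownames(cars_df) <- cars_df[[1]]
cars_mat <- data.matrix(cars_df[, -1])

rownames(cars_mat)                 # the model names, ready to be used as row labels
```

The resulting `cars_mat` can then be passed to the heatmap function in place of the original data frame.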
Suppose I have this very simple 2x2 RGB datacube that I want to plot:
set.seed(2017)
dc <- array(runif(12), dim = c(2,2,3))
I can plot this just by rasterizing the datacube:
plot(as.raster(dc), interpolate = FALSE)
But I would like to plot this data cube with the lattice package (for uniformity sake since I am mainly using it for other plotting too).
Is this possible? I am looking at levelplot, but am not able to make it work.
The problem is that lattice needs a numeric matrix, whereas a raster built from RGB values is a matrix of hex colour strings:
r <-as.raster(dc)
r
gives this result:
[,1] [,2]
[1,] "#ECC478" "#780AAC"
[2,] "#89C546" "#4A6F01"
To use it with lattice you need to transform it into a numeric matrix. This looks long, but it seems to be the only way to make sure the order is kept:
library(lattice)
m <- matrix(as.numeric(as.factor(as.vector(as.matrix(r)))), ncol= 2)
levelplot(m, panel = panel.levelplot.raster)
The problem you will get here is that you won't keep the same RGB colors, but it's a lattice solution.
Ok, this turned out to be quite an endeavor with levelplot.
I convert the RGB hex color values from raster to integers, and then use these values to map them to the color palette of the raster.
set.seed(2017)
dc <- array(runif(12), dim = c(2,2,3))
plot(as.raster(dc), interpolate = FALSE)
# convert color hexadecimals to integers
rgbInt <- apply(as.raster(dc), MARGIN = c(1,2),
FUN = function(str){strtoi(substring(str, 2), base = 16)})
rgbIntUnq <- unique(as.vector(rgbInt))
lattice::levelplot(x = t(rgbInt), # turn so rows become columns
at = c(0,rgbIntUnq),
col.regions = sprintf("#%06X", rgbIntUnq[order(rgbIntUnq)]), # to hex
ylim = c(nrow(rgbInt) + 0.5, 1 - 0.5), # plot from top to bottom
xlab = '', ylab = '')
The legend can also be removed by setting colorkey = FALSE.
I wonder whether there are simpler ways to do the same.
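To make the conversion step in the answer above easier to follow, here is the hex to integer round trip in isolation. It is just an illustration of the idea in base R, using the same example cube; the variable names are mine.

```r
set.seed(2017)
dc <- array(runif(12), dim = c(2, 2, 3))

# Hex colour strings of the raster, as a plain character vector
hex <- as.character(as.raster(dc))

# "#RRGGBB" -> integer: drop the leading "#" and parse the rest as base 16
ints <- strtoi(substring(hex, 2), base = 16L)

# integer -> "#RRGGBB": six upper-case hex digits, zero padded
back <- sprintf("#%06X", ints)

identical(back, hex)   # the round trip should give back the original colours
```

Because every colour maps to a distinct integer, the integers can serve as the z values of the levelplot while the hex strings serve as the palette.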
By default, R's heatmap will cluster rows and columns:
mtscaled = as.matrix(scale(mtcars))
heatmap(mtscaled, scale='none')
I can disable the clustering:
heatmap(mtscaled, Colv=NA, Rowv=NA, scale='none')
And then the dendrogram goes away:
But now the data is not clustered anymore.
I don't want the dendrograms to be shown, but I still want the rows and/or columns to be clustered. How can I do this?
Example of what I want:
You can do this with pheatmap:
mtscaled <- as.matrix(scale(mtcars))
pheatmap::pheatmap(mtscaled, treeheight_row = 0, treeheight_col = 0)
library(gplots)
heatmap.2(mtscaled,dendrogram='none', Rowv=TRUE, Colv=TRUE,trace='none')
Rowv = TRUE means the row dendrogram is computed and the rows are reordered based on row means.
Colv = TRUE does the same for the columns.
I had a similar issue with pheatmap, which has better visualisation than heatmap or heatmap.2. Although heatmap.2 is one option, here is a solution with pheatmap that extracts the order of the clustered data.
library(pheatmap)
mtscaled = as.matrix(scale(mtcars))
H = pheatmap(mtscaled)
Here is the output of pheatmap
pheatmap(mtscaled[H$tree_row$order,H$tree_col$order],cluster_rows = F,cluster_cols = F)
Here is the output of pheatmap after extracting the order of clusters
For ComplexHeatmap, there are function parameters to remove the dendrograms:
library(ComplexHeatmap)
Heatmap(as.matrix(iris[,1:4]), name = "mat", show_column_dend = FALSE, show_row_dend = FALSE)
You can rely on base R structures and build the hclust trees yourself:
mtscaled = as.matrix(scale(mtcars))
row_order = hclust(dist(mtscaled))$order
column_order = hclust(dist(t(mtscaled)))$order
heatmap(mtscaled[row_order,column_order], Colv=NA, Rowv=NA, scale="none")
No need to install additional junk.
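One caveat with the plain hclust order: as far as I understand, heatmap() additionally reorders each dendrogram by the row (or column) means, so the layout can differ from what heatmap(mtscaled) shows. If you want to mimic that default, a sketch along these lines should get close; treat the claim that it matches heatmap() exactly as an assumption to check on your own data.

```r
mtscaled <- as.matrix(scale(mtcars))

# Cluster, convert to a dendrogram, then reorder the leaves by the means
row_dend <- reorder(as.dendrogram(hclust(dist(mtscaled))), rowMeans(mtscaled))
col_dend <- reorder(as.dendrogram(hclust(dist(t(mtscaled)))), colMeans(mtscaled))

row_order <- order.dendrogram(row_dend)
col_order <- order.dendrogram(col_dend)

# Plot the reordered matrix without clustering and without dendrograms
heatmap(mtscaled[row_order, col_order], Rowv = NA, Colv = NA, scale = "none")
```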
Run the heatmap twice using the basic R heatmap function. Take the output of the first run, which clusters but always draws the dendrogram, and feed it into the heatmap function again, this time without clustering and without drawing the dendrogram.
#generate a random symmetrical matrix with a little bit of structure, and make a heatmap
M100s<-matrix(runif(10000),nrow=100)
M100s[2,]<-runif(100,min=0.1,max=0.2)
M100s[4,]<-runif(100,min=0.1,max=0.2)
M100s[6,]<-runif(100,min=0.1,max=0.2)
M100s[99,]<-runif(100,min=0.1,max=0.2)
M100s[37,]<-runif(100,min=0.1,max=0.2)
M100s[lower.tri(M100s)] <- t(M100s)[lower.tri(M100s)]
heatmap(M100s)
#save the output
OutputH <- heatmap(M100s)
#run it again without clustering or the dendrogram
#rows are reordered by rowInd and columns by colInd
M100c2 <- M100s[OutputH$rowInd, OutputH$colInd]
heatmap(M100c2,Rowv = NA, Colv = NA, labRow = NA, labCol = NA)
I cannot figure out how the lattice levelplot works. I have played with it for some time now, but could not find a reasonable solution.
Sample data:
Data <- data.frame(x=seq(0,20,1),y=runif(21,0,1))
Data.mat <- data.matrix(Data)
Plot with levelplot:
rgb.palette <- colorRampPalette(c("darkgreen","yellow", "red"), space = "rgb")
levelplot(Data.mat, main="", xlab="Time", ylab="", col.regions=rgb.palette(100),
cuts=100, at=seq(0,1,0.1), ylim=c(0,2), scales=list(y=list(at=NULL)))
This is the outcome:
Since I do not understand how levelplot really works, I cannot make it work. What I would like is for each colour strip to fill the whole height of the window at the corresponding x (Time).
An alternative solution using another method would also be welcome.
Basically, I'm trying here to plot the increasing risk over time, where the red is the highest risk = 1. I would like to visualize the sequence of possible increase or clustering risk over time.
From ?levelplot we're told that if the first argument is a matrix then "'x' provides the 'z' vector described above, while its rows and columns are interpreted as the 'x' and 'y' vectors respectively.", so
> m = Data.mat[, 2, drop=FALSE]
> dim(m)
[1] 21 1
> levelplot(m)
plots a levelplot with 21 columns and 1 row, where the levels are determined by the values in m. The formula interface might look like
> df <- data.frame(x=1, y=1:21, z=runif(21))
> levelplot(z ~ y + x, df)
(these approaches do not quite result in the same image).
Unfortunately I don't know much about lattice, but I noted your "Alternative solution with other method", so may I suggest another possibility:
library(plotrix)
color2D.matplot(t(Data[ , 2]), show.legend = TRUE, extremes = c("yellow", "red"))
Heaps of things to do to make it prettier. Still, a start. Of course it is important to consider the breaks in your time variable. In this very simple attempt, regular intervals are implicitly assumed, which happens to be the case in your example.
Update
Following the advice in the 'Details' section in ?color2D.matplot: "The user will have to adjust the plot device dimensions to get regular squares or hexagons, especially when the matrix is not square". Well, well, quite ugly solution.
dev.new(width = 10, height = 2.5) # open the device first (windows() also works on Windows)
par(mar = c(5.1, 4.1, 0, 2.1)) # then set the margins on that device
color2D.matplot(t(Data[ , 2]),
show.legend = TRUE,
axes = TRUE,
xlab = "",
ylab = "",
extremes = c("yellow", "red"))
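If you would rather avoid extra packages altogether, base R's image() can draw the same kind of strip, with each cell filling the full height of the plot. This is only a sketch: the seed is arbitrary, and the cell boundaries assume the regular time steps of the example data.

```r
set.seed(1)
Data <- data.frame(x = seq(0, 20, 1), y = runif(21, 0, 1))
rgb.palette <- colorRampPalette(c("darkgreen", "yellow", "red"), space = "rgb")

# Cell boundaries: one more than the number of cells, centred on each time point
x_breaks <- seq(min(Data$x) - 0.5, max(Data$x) + 0.5, by = 1)

# One column of z values, so there is a single strip along y
z <- matrix(Data$y, ncol = 1)

image(x = x_breaks, y = c(0, 1), z = z,
      col = rgb.palette(100), zlim = c(0, 1),
      xlab = "Time", ylab = "", yaxt = "n")
```

With irregular time steps you would pass the actual interval boundaries as `x_breaks` instead.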
I would like to create a heatmap via the heatmap.2() command with a color key that is centered on 0 (i.e. white color -> 0, red -> greater than 0, blue -> less than 0) while keeping scale="none" as I am interested in plotting a heatmap of the actual values. However, all of my heatmaps are not centered on zero upon using the following line:
library(gplots)
outputHeatmap <- heatmap.2(heatmapInputActual, dendrogram="none", Rowv=FALSE,
Colv=FALSE, col= bluered(256), scale="none", key=TRUE, density.info="none",
trace="none", cexRow=0.125, cexCol=0.125, symm=FALSE, symkey=TRUE)
I thought that using symkey=TRUE would work, but it does not. The variable I am trying to make a heatmap of is an (n x 3) matrix of numerical values. A problematic input to the heatmap.2() command described above follows:
8.408458 5.661144 0.00000000
4.620846 4.932283 -0.46570468
-4.638912 -3.471838 -0.12146109
-4.822829 -3.946024 0.06403327
3.948832 4.520447 -0.31945941
Thank you for your time. I look forward to your replies.
The solution seems to be simply adding symbreaks to your heatmap.2 call. Here is a fully reproducible example with your data:
library(gplots)
#read your example data
heatmapInputActual <- read.table(textConnection(
"8.408458 5.661144 0.00000000
4.620846 4.932283 -0.46570468
-4.638912 -3.471838 -0.12146109
-4.822829 -3.946024 0.06403327
3.948832 4.520447 -0.31945941
"),as.is=TRUE)
#convert sample data to matrix
heatmapInputActual <- as.matrix(heatmapInputActual)
#just add symbreaks to the end of your code
heatmap.2(heatmapInputActual, dendrogram="none", Rowv=FALSE, Colv=FALSE,
col = bluered(256), scale="none", key=TRUE, density.info="none",
trace="none", cexRow=0.125, cexCol=0.125, symm=FALSE, symkey=TRUE, symbreaks=TRUE)
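To see why symmetric breaks centre the colours on zero, here is the idea in plain base R, without gplots. It is a sketch of the principle, not of heatmap.2's internals: the breaks run from -max(|x|) to +max(|x|), so the middle of a blue-white-red palette lands on 0. The palette built with colorRampPalette is my stand-in for bluered(256).

```r
m <- matrix(c( 8.408458,  5.661144,  0.00000000,
               4.620846,  4.932283, -0.46570468,
              -4.638912, -3.471838, -0.12146109,
              -4.822829, -3.946024,  0.06403327,
               3.948832,  4.520447, -0.31945941),
            ncol = 3, byrow = TRUE)

# Breaks symmetric around zero: n colours need n + 1 break points
lim <- max(abs(m))
n_colors <- 256
breaks <- seq(-lim, lim, length.out = n_colors + 1)
pal <- colorRampPalette(c("blue", "white", "red"))(n_colors)

# Flip the rows so that the first row of m is drawn at the top
image(t(m[nrow(m):1, ]), col = pal, breaks = breaks, axes = FALSE)
```

heatmap.2 also has a breaks argument, so a vector built this way could be passed to it directly if you need full control over the range.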