hide the NA values when using display_numbers function in pheatmap - r

I am plotting a heatmap with the pheatmap package in R.
I used the display_numbers argument to display the values of a matrix in the heatmap, and I got:
heatmap
I have many NA values in my matrix and I would like to hide them in the heatmap. How can I do that?

First off, it is a lot easier for people to help you if you provide minimal, reproducible sample data. Please consider reviewing how to provide a minimal reproducible example/attempt for future posts.
As to your question:
Let's generate some sample data
library(pheatmap)
set.seed(2018)
mat <- matrix(runif(20), 4, 5)
We use a second matrix to display values via the argument display_numbers of pheatmap. Here we simply copy the original matrix and set every value below 0.5 to NA:
mat2 <- mat
mat2[mat2 < 0.5] <- NA
We now replace NA values with empty strings.
mat2[is.na(mat2)] <- ""
Let's show the heatmap
pheatmap(mat, display_numbers = mat2)
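If the displayed numbers should also be rounded, the label matrix can be built as character strings first. This is a minimal sketch, assuming two decimal places are wanted. The name cell_labels is introduced here for illustration only, and the plot is drawn only when pheatmap is installed.

```r
set.seed(2018)
mat <- matrix(runif(20), 4, 5)

# Values to display; anything below 0.5 is hidden
mat2 <- mat
mat2[mat2 < 0.5] <- NA

# Character matrix of the same shape: rounded numbers, "" where mat2 is NA
cell_labels <- mat2
cell_labels[] <- sprintf("%.2f", mat2)
cell_labels[is.na(mat2)] <- ""

# Draw the heatmap only if pheatmap is available
if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(mat, display_numbers = cell_labels)
}
```

Building the labels separately keeps the numeric matrix mat untouched, so the colors still reflect the full data.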

Related

How to add label in table() in R

I want to print a 2x2 confusion matrix and label it. I am using table() in R.
I want to add "Predicted" and "Reality" labels. Can anybody suggest how I can do it?
This is a similar problem to the one in this question. You can follow that approach,
but simplify things a bit by just using a matrix to hold the values, and
just setting its dimension names to "predicted" and "observed":
# create some fake data (2x2, since we're building a confusion matrix)
dat <- matrix(data=runif(n=4, min=0, max=1), nrow=2, ncol=2,
dimnames=list(c("pos", "neg"), c("pos", "neg")))
# now set the names *of the dimensions* (not the row/colnames)
names(dimnames(dat)) <- c("predicted", "observed")
# and we get what we wanted
dat
# output:
#          observed
# predicted       pos       neg
#       pos 0.8736425 0.7987779
#       neg 0.2402080 0.6388741
Update: @thelatemail made the nice point in the comments that you can specify dimension names when creating tables. The same is true of matrices, except you supply them as names of the dimnames list elements when calling matrix(). So here's an even more compact way:
matrix(data=runif(n=4, min=0, max=1), nrow=2, ncol=2,
dimnames=list(predicted=c("pos", "neg"), observed=c("pos", "neg")))
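For the table() case mentioned in the update, a minimal sketch might look like the following. The two factor vectors are made-up illustration data, not taken from the question.

```r
# Hypothetical predictions and ground truth, for illustration only
predicted <- factor(c("pos", "pos", "neg", "neg", "pos", "neg"),
                    levels = c("pos", "neg"))
observed  <- factor(c("pos", "neg", "neg", "neg", "pos", "pos"),
                    levels = c("pos", "neg"))

# Naming the arguments sets the names of the dimensions
cm <- table(predicted = predicted, observed = observed)
cm

# The dnn argument is an equivalent way to supply the dimension names
cm2 <- table(predicted, observed, dnn = c("predicted", "observed"))
```

Fixing the factor levels up front keeps the table 2x2 even if one class never appears in the predictions.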

Highlight Subset Cells From Heatmap By Row/Col Index

I'm trying to visually inspect, and extract, subsets of large heatmaps. For example, I'd like to roughly extract the row/col indices for clusters like the one I circled below:
Following the advice from here, I hope to achieve this by creating rectangles around subsets of cells by index and repeat until I've highlighted areas close enough to what I want.
Using some simpler data, I tried this:
library(gplots)
set.seed(100)
# Input data 4x5 matrix
nx <- 5
ny <- 4
dat <- matrix(runif(20, 1, 10), nrow=ny, ncol=nx)
# Get hierarchically clustered heatmap matrix
hm <- heatmap.2(dat, main="Test HM", key=T, trace="none")
hmat <- dat[rev(hm$rowInd), hm$colInd]
# Logical matrix with the same dimensions as our data
# indicating which cells I want to subset
selection <- matrix(rep(F,20), nrow=4)
# For example: the third row
selection[3,] <- T
#selection <- dat>7 # Typical subsets like this don't work either
# Function for making selection rectangles around selection cells
makeRects <- function(cells){
coords = expand.grid(1:nx,1:ny)[cells,]
xl=coords[,1]-0.49
yb=coords[,2]-0.49
xr=coords[,1]+0.49
yt=coords[,2]+0.49
rect(xl,yb,xr,yt,border="black",lwd=3)
}
# Re-make heatmap with rectangles based on the selection
# Use the already computed heatmap matrix and don't recluster
heatmap.2(hmat, main="Heatmap - Select 3rd Row", key=T, trace="none",
dendrogram="none", Rowv=F, Colv=F,
add.expr={makeRects(selection)})
This does not work. Here is the result. Instead of the third row being highlighted, we see a strange pattern:
It must have to do with this line:
coords = expand.grid(1:nx,1:ny)[cells,]
# with parameters filled...
coords = expand.grid(1:5,1:4)[selection,]
Can anyone explain what's going on here? I'm not sure why my subset isn't working even though it is similar to the one in the other question.
Very close. I think the problem is in the makeRects() function: the grid has to be built rows-first (ny:1, then 1:nx), and the x and y columns swapped accordingly. In my hands, it works with these changes.
# Function for making selection rectangles around selection cells
makeRects <- function(cells){
coords = expand.grid(ny:1, 1:nx)[cells,]
xl=coords[,2]-0.49
yb=coords[,1]-0.49
xr=coords[,2]+0.49
yt=coords[,1]+0.49
rect(xl,yb,xr,yt,border="black",lwd=3)
}
# Re-make heatmap with rectangles based on the selection
# Use the already computed heatmap matrix and don't recluster
heatmap.2(hmat, main="Heatmap - Select 3rd Row", key=T, trace="none",
dendrogram="none", Rowv=F, Colv=F,
add.expr={makeRects(selection)})
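To see why the order of the expand.grid() arguments matters, the coordinates can be inspected without plotting anything. This is a sketch under the assumption that heatmap.2 draws the first matrix row at the top when reordering is switched off, which is what the ny:1 in the fix implies. The names idx, wrong and right are illustrative.

```r
nx <- 5
ny <- 4
selection <- matrix(FALSE, nrow = ny, ncol = nx)
selection[3, ] <- TRUE

# R stores matrices column by column, so flattening runs down the rows first
idx <- which(as.vector(selection))   # 3 7 11 15 19

# Original grid: x varies fastest, which does not match the flattening order
wrong <- expand.grid(x = 1:nx, y = 1:ny)[idx, ]

# Corrected grid: y varies fastest, reversed because row 1 sits at the top
right <- expand.grid(y = ny:1, x = 1:nx)[idx, ]

wrong   # scattered cells, the "strange pattern"
right   # one horizontal line of cells across all five columns
```

With the corrected grid, every selected cell has the same y value and x runs from 1 to 5, which is the third row from the top.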

How can I extract the matrix derived from a heatmap created with gplots after hierarchical clustering?

I am making a heatmap, but I can't assign the result to a variable to check it before plotting. RStudio plots it automatically. I would like to get the list of row names in the order of the heatmap. I'm not sure if this is possible. I'm using this code:
hm <- heatmap.2( assay(vsd)[ topVarGenes, ], scale="row",
trace="none", dendrogram="both",
col = colorRampPalette( rev(brewer.pal(9, "RdBu")) )(255),
ColSideColors = c(Controle="gray", Col1.7G2="darkgreen", JG="blue", Mix="orange")[
colData(vsd)$condition ] )
You can assign the plot to an object. The plot will still be drawn in the plot window, however, you'll also get a list with all the data for each plot element. Then you just need to extract the desired plot elements from the list. For example:
library(gplots)
p = heatmap.2(as.matrix(mtcars), dendrogram="both", scale="row")
p is a list with all the elements of the plot.
p # Outputs all the data in the list; lots of output to the console
str(p) # Structure of p; also lots of output to the console
names(p) # Names of all the list elements
p$rowInd # Ordering of the data rows
p$carpet # The heatmap values
You'll see all the other values associated with the dendrogram and the heatmap if you explore the list elements.
For others out there, here is a more complete way to capture a matrix representation of the heatmap created by gplots:
matrix_map <- p$carpet
matrix_map <- t(matrix_map)
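To get the row names in heatmap order, as asked in the question, the rowInd element can be used to index the original row names. This is a minimal sketch using mtcars as above. It assumes that rowInd lists rows from the bottom of the plot upwards, so the vector is reversed to read top to bottom. The variable names are illustrative.

```r
library(gplots)

m <- as.matrix(mtcars)

pdf(NULL)  # discard the drawing; only the return value is needed here
p <- heatmap.2(m, dendrogram = "both", scale = "row", trace = "none")
dev.off()

# Row names in plotting order, then flipped to read from the top down
rows_bottom_to_top <- rownames(m)[p$rowInd]
rows_top_to_bottom <- rev(rows_bottom_to_top)
head(rows_top_to_bottom)
```

It is worth checking the first few names against the plotted row labels before relying on the direction.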

Densityplots using colwise - different colors for each line?

I need a plot of several density lines, each in a different color. This is some example code (much smaller than my real case), using the built-in data.frame USArrests. I hope it is OK to use it.
library(plyr)  # provides colwise()
colors <- heat.colors(3)
plot(density(USArrests[,2], bw=1, kernel="epanechnikov", na.rm=TRUE),col=colors[1])
lines1E <- function(x)lines(density(x,bw=1,kernel="epanechnikov",na.rm=TRUE))
lines1EUSA <- colwise(lines1E)(USArrests[,3:4])
Currently the code with colwise() produces just one color. How can I draw each line in a different color? Or is there a better way to plot several density lines with different colors?
I don't quite follow your example, so I've created my own example data set. First, create a matrix with three columns:
m = matrix(rnorm(60), ncol=3)
Then plot the density of the first column:
plot(density(m[,1]), col=2)
Using your lines1E function as a template:
lines1E = function(x) {lines(density(x))}
We can add multiple curves to the plot:
colwise(lines1E)(as.data.frame(m[ ,2:3]))
Personally, I would just use:
##Added in NA for illustration
m = matrix(rnorm(60), ncol=3)
m[1,] = NA
plot(density(m[,1], na.rm=T))
sapply(2:ncol(m), function(i) lines(density(m[,i], na.rm=T), col=i))
to get:
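The same idea can be applied to the USArrests columns from the question. This is a minimal sketch, assuming the bandwidth and kernel from the question are kept. The names cols, dens, xlim and ylim are illustrative, and the axis limits are computed so that all three curves fit in one plot.

```r
cols <- heat.colors(3)

# One density estimate per column (Assault, UrbanPop, Rape)
dens <- lapply(USArrests[, 2:4], density,
               bw = 1, kernel = "epanechnikov", na.rm = TRUE)

# Axis limits wide enough for every curve
xlim <- range(sapply(dens, function(d) range(d$x)))
ylim <- range(sapply(dens, function(d) range(d$y)))

# First curve sets up the plot, the rest are added with their own color
plot(dens[[1]], col = cols[1], xlim = xlim, ylim = ylim,
     main = "USArrests densities")
for (i in 2:length(dens)) lines(dens[[i]], col = cols[i])
legend("topright", legend = names(dens), col = cols, lty = 1)
```

Computing the densities first makes it easy to set common axis limits, which the colwise() version cannot do.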

Trying to determine why my heatmap made using heatmap.2 and using breaks in R is not symmetrical

I am trying to cluster a protein-DNA interaction dataset and draw a heatmap using heatmap.2 from the R package gplots. My matrix is symmetrical.
Here is a copy of the dataset I am using after it is run through Pearson correlation: DataSet
Here is the complete process that I follow to generate these graphs: generate a distance matrix using some correlation (in my case Pearson), then pass that matrix to R and run the following code on it:
library(RColorBrewer);
library(gplots);
library(MASS);
args <- commandArgs(TRUE);
matrix_a <- read.table(args[1], sep='\t', header=T, row.names=1);
mtscaled <- as.matrix(scale(matrix_a))
# location <- args[2];
# setwd(args[2]);
pdf("result.pdf", pointsize = 15, width = 18, height = 18)
mycol <- c("blue","white","red")
my.breaks <- c(seq(-5, -.6, length.out=6),seq(-.5999999, .1, length.out=4),seq(.100009,5, length.out=7))
#colors <- colorpanel(75,"midnightblue","mediumseagreen","yellow")
result <- heatmap.2(mtscaled, Rowv=T, scale='none', dendrogram="row", symm = T, col=bluered(16), breaks=my.breaks)
dev.off()
The issue I am having is that once I use breaks to control the color separation, the heatmap no longer looks symmetrical.
Here is the heatmap before I use breaks, as you can see the heatmap looks symmetrical:
Here is the heatmap when breaks are used:
I have played with the cutoffs for the sequences to make sure, for instance, that one sequence does not end exactly where the next begins, but I am not able to solve this problem. I would like to use the breaks to help bring out the clusters more.
Here is an example of what it should look like, this image was made using cluster maker:
I don't expect it to look identical to that, but I would like it if my heatmap is more symmetrical and I had better definition in terms of the clusters. The image was created using the same data.
After some investigating, I noticed that the values were changing after running my matrix through heatmap or heatmap.2. For example, the interaction between Pacdh-2 and pegg-2 in the provided data set had a value of 0.0250313 before the matrix was sent to heatmap.
After that I looked at the matrix values using result$carpet, and the values were then -0.224333135 and -1.09805379 for the two interactions.
So I decided to reorder the original matrix based on the dendrogram from the clustered matrix, so that I could be sure the values would be the same. I used the following Stack Overflow question for help:
Order of rows in heatmap?
Here is the code used for that:
rowInd <- rev(order.dendrogram(result$rowDendrogram))
colInd <- rowInd
data_ordered <- matrix_a[rowInd, colInd]
I then used another program "matrix2png" to draw the heatmap:
I still have to play around with the colors but at least now the heatmap is symmetrical and clustered.
Looking into it even more, the issue seems to be that I was running scale(matrix_a). When I change my code to just mtscaled <- as.matrix(matrix_a), the result looks symmetrical.
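The effect of scale() on a symmetric matrix can be checked on a small example. This is a sketch with made-up random data, not the dataset from the question. scale() centers and rescales each column separately, so entry [i, j] and entry [j, i] are transformed with different means and standard deviations.

```r
set.seed(1)
a <- matrix(runif(16), 4, 4)
sym <- (a + t(a)) / 2          # symmetric by construction

isSymmetric(sym)               # the input is symmetric

scaled <- scale(sym)           # column-wise centering and scaling

# Largest difference between the scaled matrix and its transpose
asym <- max(abs(scaled - t(scaled)))
asym                           # clearly above zero: symmetry is lost
```

This matches the observation that the values in result$carpet for the two directions of the same interaction differed.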
I'm certainly not the person to attempt reproducing and testing this from that strange data object without code that would read it properly, but here's an idea:
..., col=bluered(20)[4:20], ...
Here's another thought, which should return the full range of red, which the above strategy would not:
shift.BR<- colorRamp(c("blue","white", "red"), bias=0.5 )((1:16)/16)
heatmap.2( ...., col=rgb(shift.BR, maxColorValue=255), .... )
Or you can use this vector:
> rgb(shift.BR, maxColorValue=255)
[1] "#1616FF" "#2D2DFF" "#4343FF" "#5A5AFF" "#7070FF" "#8787FF" "#9D9DFF" "#B4B4FF" "#CACAFF" "#E1E1FF" "#F7F7FF"
[12] "#FFD9D9" "#FFA3A3" "#FF6C6C" "#FF3636" "#FF0000"
There was a somewhat similar question (also today) that was asking for a blue to red solution for a set of values from -1 to 3 with white at the center. This is the code and output for that question:
test <- seq(-1,3, len=20)
shift.BR <- colorRamp(c("blue","white", "red"), bias=2)((1:20)/20)
tpal <- rgb(shift.BR, maxColorValue=255)
barplot(test,col = tpal)
(But that would seem to be the wrong direction for the bias in your situation.)
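Whichever palette is chosen, heatmap.2 expects one more break than there are colors. A minimal sketch of keeping the two in step, assuming evenly spaced breaks that are symmetric around zero are acceptable for this data (the names my.breaks and pal are illustrative, and the final call is left as a comment because it needs the asker's matrix):

```r
# 17 evenly spaced break points from -5 to 5
my.breaks <- seq(-5, 5, length.out = 17)

# One color per interval between neighbouring breaks
pal <- colorRampPalette(c("blue", "white", "red"))(length(my.breaks) - 1)

length(my.breaks)   # number of breaks
length(pal)         # number of colors, one fewer than the breaks

# heatmap.2(mtscaled, scale = "none", symm = TRUE, col = pal, breaks = my.breaks)
```

Deriving the palette length from the breaks avoids a mismatch when the break points are edited later.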
