R: heatmap.2 change color key - r

I have a question about the gplots package. I want to use the function heatmap.2 and move the symmetric point of the color key from 0 to 1. Normally, when symkey=TRUE and you use col=redgreen(), a color bar is created where the colors are assigned like this:
red = -2 to -0.5
black=-0.5 to 0.5
green= 0.5 to 2
Now I want to create a color bar like this:
red= -1 to 0.8
black= 0.8 to 1.2
green= 1.2 to 3
Is something like this possible?
Thank you!

If you look at the heatmap.2 help file, it looks like you want the breaks argument. From the help file:
breaks (optional) Either a numeric vector indicating the splitting points for binning x into colors, or a integer number of break points to be used, in which case the break points will be spaced equally between min(x) and max(x)
So, you use breaks to specify the cutoff points for each colour. e.g.:
library(gplots)
# make up a bunch of random data from -1, -.9, -.8, ..., 2.9, 3
# 10x10
x = matrix(sample(seq(-1,3,by=.1),100,replace=TRUE),ncol=10)
# plot. We want -1 to 0.8 being red, 0.8 to 1.2 being black, 1.2 to 3 being green.
heatmap.2(x, col=redgreen, breaks=c(-1,0.8,1.2,3))
The crucial bit is the breaks=c(-1,0.8,1.2,3) being your cutoffs.
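To check which colour a given value would fall into, the binning can be reproduced in base R without drawing anything. This is only a sketch: the names brks, cols and vals are made up for illustration, and cut() here uses right-closed intervals, which may treat values sitting exactly on a cutoff differently from heatmap.2 itself.

```r
# Cutoffs from the answer: three colour intervals need four break points
brks <- c(-1, 0.8, 1.2, 3)
cols <- c("red", "black", "green")

# A few sample values spanning the range (hypothetical)
vals <- c(-1, 0, 0.8, 1, 1.2, 2.5, 3)

# cut() assigns each value to an interval (lo, hi];
# include.lowest = TRUE keeps the minimum (-1) in the first interval
bin <- cut(vals, breaks = brks, labels = cols, include.lowest = TRUE)
print(data.frame(value = vals, colour = as.character(bin)))
```

Note that there is always one more break point than there are colours.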

Related

gnuplot: default value for heatmap (if no data is available)

Can you tell me how to specify the default cb (or z) value?
I build a 3D chart {x,y,z} or {x,y,cb}, but different x values have different ranges of y, and as a result white bars are visible on the chart (for heatmap/colorbox). I would like to see no white stripes: where there is no data, gnuplot should substitute a default value (for example, 0) and paint that area with the corresponding heatmap color.
You have several options, depending on exactly what plot mode you are using and what type of data you have. In general you can use two properties of the color assignment to get what you want:
1) out-of-bound values are mapped to the color of the extreme min or max of the colorbar. So one option is to assign a palette that has your desired "default" color at the min and max, independent of whatever palette function you use for the rest of the range
2) data values that are "missing" or "not-a-number" generally leave a hole in the grid of a pixel image or heat map that lets the background color show through.
There is a demo imageNaN.dem in the standard demo set that shows use of these features for several 2D and 3D heat map commands. The output from a heatmap generated by splot $matrixdata matrix with image is shown below. You can see extreme values pinned to the min/max of the colorbar range.
Note that if you want some color other than the background to show through, you could position a colored rectangle behind the heat map surface.
# Define the test data as a named data block
$matrixdata << EOD
0 5 4 3 0
? 2 2 0 1
Junk 1 2 3 5
NaN 0 0 3 0
Inf 3 2 0 3
-Inf 0 1 2 3
EOD
set view map
set datafile missing '?'
unset xtics
set ytics ("0" 0.0, "?" 1.0, "Junk" 2.0, "NaN" 3.0, "Inf" 4.0, "-Inf" 5.0)
set cblabel "Score"
set cbrange [ -2.0 : 7.0 ]
splot $matrixdata matrix using 1:2:(0):3 with image
@Ethan, some of my data really is missing, which results in the white gaps.
I could fill in the missing data with 0 when generating the data file, but then some files become very large and gnuplot uses up all the memory.
So I'm looking for another way to solve the problem.
My example:
For @Ethan, my code:
set arrow from 0,86400 rto graph 1, graph 0 nohead ls 5 front
#===> solution to the problem
set object rectangle from graph 0, graph 0 to graph 1, graph 1 behind fc rgbcolor 'blue' fs noborder
set pm3d map
# set pm3d interpolate 32,32
set size square
set palette rgbformulae 22,13,-31
splot inputFullPath u 2:1:(percentage($4)) notitle
and my data (for example):
0 1 0.1
0 2 0.2
0 4 0.5
# -------- {0,5..7} - white gap
# -------- {1,1..3} - white gap
1 3 0.6
1 4 0.5
1 7 0.9

How to change the contours and legend in mathematica contour plot?

The ContourPlot function in Mathematica automatically gets you a legend and contours with colors on the plot which are uniformly distributed ( for example, blue color from 0.1 to 0.2 function values, green from 0.2 to 0.3 and etc.) In my case, function, that I plot, has a large number of values in the 0.1 to 0.2 and only few from 0.2 to 1. If I want to distinguish better values from 0.1 to 0.2 and make several colors for this section, and make the values from 0.2 to 1 by one color, how should I do this?
I would use the Mathematica function Hue[z] to assign a color to your contours. To do this, you're going to use the option ColorFunction, like this:
ContourPlot[myFunction, {x,-10,10}, {y,-10,10}, ColorFunction -> Function[{f},Hue[g[f]]]]
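In this code, g[f] is some function that maps the contour level to a hue. Hue expects a value between 0 and 1 (values outside that range wrap around cyclically), and by default ContourPlot passes the function value to ColorFunction already scaled to the range 0 to 1. You said you wanted many values between 0 and 0.2, and only a few between 0.2 and 1, so I would use something like
g[f_] := 0.8*f^(1/4)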
Obviously you can change this to fit. If this doesn't help, you may need to increase the number of contours, using the option Contours->n, where n is how many you want. Hope this helps!

heatmap.2 specify row order OR prevent reorder?

I'm trying to generate some plots of log-transformed fold-change data using heatmap.2 (code below).
I'd like to order the rows in the heatmap by the values in the last column (largest to smallest). The rows are being ordered automatically (I'm unsure of the precise calculation used 'under the hood') and, as shown in the image, there is some clustering being performed.
sample_data
gid 2hrs 4hrs 6hrs 8hrs
1234 0.5 0.75 0.9 2
2234 0 0 1.5 2
3234 -0.5 0.1 1 3
4234 -0.2 -0.2 0.4 2
5234 -0.5 1.2 1 -0.5
6234 -0.5 1.3 2 -0.3
7234 1 1.2 0.5 2
8234 -1.3 -0.2 2 1.2
9234 0.2 0.2 0.2 1
0123 0.2 0.2 3 0.5
code
data <- read.csv(infile, sep='\t',comment.char="#")
rnames <- data[,1] # assign labels in column 1 to "rnames"
mat_data <- data.matrix(data[,2:ncol(data)]) # transform columns into a matrix
rownames(mat_data) <- rnames # assign row names
# custom palette
my_palette <- colorRampPalette(c("turquoise", "yellow", "red"))(n = 299)
# (optional) defines the color breaks manually for a "skewed" color transition
col_breaks = c(seq(-4,-1,length=100), # for turquoise
seq(-1,1,length=100), # for yellow
seq(1,4,length=100)) # for red
# plot data
heatmap.2(mat_data,
density.info="none", # turns off density plot inside color legend
trace="none", # turns off trace lines inside the heat map
margins =c(12,9), # widens margins around plot
col=my_palette, # use the color palette defined earlier
breaks=col_breaks, # enable color transition at specified limits
dendrogram='none', # do not draw any dendrogram
Colv=FALSE) # turn off column clustering
Plot
I'm wondering if anyone can suggest either how to turn off reordering so I can reorder my matrix by the last column and force this order to be used, or alternatively hack the heatmap.2 function to do this.
You are not specifying Rowv=FALSE, so by default the rows are reordered. From the heatmap.2 help for the parameter Rowv:
determines if and how the row dendrogram should be reordered. By
default, it is TRUE, which implies dendrogram is computed and
reordered based on row means. If NULL or FALSE, then no dendrogram is
computed and no reordering is done.
So if you want to have the rows ordered by the last column, you can do:
mat_data<-mat_data[order(mat_data[,ncol(mat_data)],decreasing=T),]
and then
heatmap.2(mat_data,
density.info="none",
trace="none",
margins =c(12,9),
col=my_palette,
breaks=col_breaks,
dendrogram='none',
Rowv=FALSE,
Colv=FALSE)
You will get the following image :
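The reordering step can be checked on its own before plotting. The sketch below rebuilds the sample data from the question by hand (an assumption, since the original reads it from a file) and sorts the rows by the last column in decreasing order; the names ord and sorted are made up for illustration.

```r
# Sample data from the question, typed in by hand (row names kept as strings)
mat_data <- matrix(
  c( 0.5,  0.75, 0.9,  2,
     0,    0,    1.5,  2,
    -0.5,  0.1,  1,    3,
    -0.2, -0.2,  0.4,  2,
    -0.5,  1.2,  1,   -0.5,
    -0.5,  1.3,  2,   -0.3,
     1,    1.2,  0.5,  2,
    -1.3, -0.2,  2,    1.2,
     0.2,  0.2,  0.2,  1,
     0.2,  0.2,  3,    0.5),
  ncol = 4, byrow = TRUE,
  dimnames = list(
    c("1234", "2234", "3234", "4234", "5234",
      "6234", "7234", "8234", "9234", "0123"),
    c("2hrs", "4hrs", "6hrs", "8hrs")))

# Row indices sorted by the last column, largest value first
ord <- order(mat_data[, ncol(mat_data)], decreasing = TRUE)
sorted <- mat_data[ord, ]
print(sorted)
```

Passing the sorted matrix to heatmap.2 with Rowv=FALSE then keeps the rows in this order.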

how to change the size, color of points in a scatter plot in R

You can find the example data below.
I want to color the points higher than 0 in one color and those lower than 0 in another. Is there any way to know which points they are? I simply want to add borders at 1 and -1, then show the points higher than 1 in one color with their names printed next to them, and do the same for the points lower than -1 in a different color.
This comment did not help since make read line randomly
x=(1:990)
cl = 1*(z>0) + 2*(z<=0)
cx = 1*(z>0) + 1.2*(z<=0)
plot(y~x, col=cl, cex=cx)
I don't want to generate red and black points around zero.
I want to detect those points higher and lower than 1 and -1 respectively.
I also want to plot them in different color and different size
Generate some data around 0:
d<-rnorm(1000,0,1)
To get the points higher than 0:
d[d>0]
To identify the index of points higher than 0:
which(d>0)
Plot points above 0 in green below 0 in red. Also, points above 0 will be a different size than points below 0:
s <- character(length(d))
s[d>0] <- "green"
s[d<0] <- "red"
# s[d > -0.5 & d < 0.5] <- "black" # to color points between 0.5 and -0.5 black
plot(d, col=s) # color effect only
sz <- numeric(length(d))
sz[d>0] <- 4 # I'm giving points greater than 0 a size of 4
sz[d<0] <- 1
plot(d, col=s, cex=sz) # size and color effect
Now, you also mention points above and below 1 and -1, respectively. You should be able to follow the code above to do what you want.
To add labels to points meeting a certain condition (e.g. greater than or less than 0.2 and -0.2, respectively), you can use the text function:
text(which(abs(d) > .2), d[abs(d) > .2], labels = which(abs(d) > .2), cex = 0.5, pos=3)
pos = 3 means to put the label above the point, and the cex argument to text is for adjusting the label size.
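For the thresholds at 1 and -1 mentioned in the question, the same approach might look like the sketch below. The data are simulated, the colours and sizes are arbitrary choices, and the points are labelled by their index because the real point names are not shown in the question.

```r
set.seed(42)                 # arbitrary seed, only for reproducibility
d <- rnorm(200)              # simulated data standing in for the real values

# Default colour and size for points inside [-1, 1]
col_vec <- rep("grey60", length(d))
cex_vec <- rep(0.6, length(d))

hi <- d > 1                  # points above the upper border
lo <- d < -1                 # points below the lower border
col_vec[hi] <- "darkgreen"
col_vec[lo] <- "red"
cex_vec[hi | lo] <- 1.5

plot(d, col = col_vec, cex = cex_vec, pch = 19)
abline(h = c(-1, 1), lty = 2)    # draw the two borders

# Label only the points outside the borders, using their index as the name
out <- which(hi | lo)
text(out, d[out], labels = out, cex = 0.5, pos = 3)
```

If the data have real names, they can be passed to labels instead of the indices, for example labels = names(d)[out].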
As the comments mentioned, there are many ways of doing this. Assuming that you are using the plot() function, here's a simple way of doing what you want. The key is to understand the arguments of plot(). The color of the points is determined by col, the size by cex, and so forth. These should all be vectors of the same length as y (otherwise the recycling rule is used). See ?plot.
N = 999 # I don't care how many obs you have
y = rnorm(N)
# vector of colors (black for y>0, red for y<=0)
cl = 1*(y>0) + 2*(y<=0)
# vector of point sizes relative to default (1 for y>0, 1.2 y<=0)
cx = 1*(y>0) + 1.2*(y<=0)
plot(y, col=cl, cex=cx)
Edit:
I tried to give a general example (eg, coloring points by a third variable), but OP insists he had 2 variables. Well, just rename z by say x.
Edit:
# last edit I make
set.seed(1)
y = rnorm(N)
cl = rep(1, length(y))
cl[y > 0.5] = 2
cl[y < -0.5] = 3
plot(y, col=cl)
And here's what it gives:

How to separate the two leftmost bins of a histogram in R

Suppose I need to plot a dataset like below:
set.seed(1)
dataset <- sample(1:7, 1000, replace=T)
hist(dataset)
As you can see in the plot below, the two leftmost bins do not have any space between them unlike the rest of the bins.
I tried changing xlim, but it didn't work. Basically I would like to have each number (1 to 7) represented as a bin, and additionally, I would like any two adjacent bins to have space between them. Thanks!
The best way is to set the breaks argument manually. Using the data from your code,
hist(dataset,breaks=rep(1:7,each=2)+c(-.4,.4))
gives the following plot:
The first part, rep(1:7,each=2), is what numbers you want the bars centered around. The second part controls how wide the bars are; if you change it to c(-.49,.49) they'll almost touch, if you change it to c(-.3,.3) you get narrower bars. If you set it to c(-.5,.5) then R yells at you because you aren't allowed to have the same number in your breaks vector twice.
Why does this work?
If you split up the breaks vector, you get one part that looks like this:
> rep(1:7,each=2)
[1] 1 1 2 2 3 3 4 4 5 5 6 6 7 7
and a second part that looks like this:
> c(-.4,.4)
[1] -0.4 0.4
When you add them together, R loops through the second vector as many times as needed to make it as long as the first vector. So you end up with
1-0.4 1+0.4 2-0.4 2+0.4 3-0.4 3+0.4 [etc.]
= 0.6 1.4 1.6 2.4 2.6 3.4 [etc.]
Thus, you have one bar from 0.6 to 1.4--centered around 1, with width 2*.4--another bar from 1.6 to 2.4 centered around 2 with width 2*.4, and so on. If you had data in between (e.g. 2.5) then the histogram would look kind of silly, because it would create a bar from 2.4 to 2.6, and the bar widths would not be even (since that bar would only be .2 wide, while all the others are .8). But with only integer values that's not a problem.
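The effect of these breaks can be inspected without drawing the plot. The following sketch reuses the data from the question and asks hist() for the counts only; the names brks and h are made up for illustration.

```r
set.seed(1)
dataset <- sample(1:7, 1000, replace = TRUE)

# 14 break points: a lower and an upper edge around each integer 1..7
brks <- rep(1:7, each = 2) + c(-0.4, 0.4)
print(brks)

# plot = FALSE returns the histogram object instead of drawing it
h <- hist(dataset, breaks = brks, plot = FALSE)

# 13 bins: odd-numbered bins hold the integers, even-numbered bins are the empty gaps
print(h$counts)
```

The empty bins between the bars are what produce the visible spacing.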
You need six bars NOT seven bars; that is what your histogram has space for. But then you end up generating seven bars. That is the bug.
do sample(1:6, 1000, replace=T) instead of sample(1:7, 1000, replace=T)
If you do need seven bars, then seed with 0
