I managed to generate a heatmap in R using the heatmap function
(heatmap(heatmap_16m, col=redgreen(75)))
and got the following:
As you can see, it has a balanced distribution of red, black and green colors.
Since the heatmap function cannot draw a legend, I switched to the heatmap.2 function (heatmap.2(heatmap_16m, col=redgreen(75), trace="none")) and got the following:
Here the color distribution is skewed mainly towards red.
So my question is: how do I get the appearance (legend, row and column dendrogram order) of the second heatmap with the distribution of greens and reds of the first heatmap?
I found the answer accidentally while searching for something else :)
Here it goes:
heatmap.2(heatmap_16m, col= greenred(75), trace="none",
scale="row")
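This works because heatmap() scales the rows by default (scale="row"), whereas heatmap.2() defaults to scale="none". You can also scale by column (scale="column"), depending on the data.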
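To make the effect of scale="row" concrete, here is a minimal base R sketch of the row scaling itself. The matrix m is a made-up stand-in for heatmap_16m, not the original data.

```r
# Hypothetical stand-in for heatmap_16m: four rows on very different scales
set.seed(1)
m <- matrix(rnorm(40, mean = rep(c(0, 10, 100, 1000), each = 10)),
            nrow = 4, byrow = TRUE)

# Row scaling: centre each row on 0 and divide by its standard deviation
row_scaled <- t(scale(t(m)))

rowMeans(row_scaled)       # numerically zero for every row
apply(row_scaled, 1, sd)   # 1 for every row
```

Without this step the colors follow the raw values, so a few rows with large values can dominate the palette.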
I have a curve, for instance
y_curve=c(1,2,5,6,9,1).
and the colors for each curve point
colors=c("#0000FF","#606060","#606060","#FF0000","#FF0000","#FF0000").
In theory I want to plot a curve where the first half has one color (except for the first point which is blue) and the second half has another color. In my example the dataset has more than 3000 observations so it makes sense.
If I plot the data using the command
plot(y_curve,col=colors), the points are colored correctly.
However, if I add the option type="l", the plotted curve has only one color: blue, the first color in the vector colors ("#0000FF").
Does anyone know what I am doing wrong?
So the code is
y_curve=c(1,2,5,6,9,1)
colors=c("#0000FF","#606060","#606060","#FF0000","#FF0000","#FF0000")
plot(y_curve,col=colors,type="l")
Thank you all in advance.
I avoid using ggplot since this part of code is inside an already complicated function and I prefer using the base R commands.
With type="l", the plot function draws the whole curve as a single line, so it uses only the first color in col.
Instead, we can use the segments() function to draw each segment individually with its own color.
y_curve=c(1,2,5,6,9,1)
colors=c("#0000FF","#606060","#606060","#FF0000","#FF0000","#FF0000")
#create a mostly blank plot
plot(y_curve,col=NA)
# Use this to show the points:
#plot(y_curve,col=colors)
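#index variable (x coordinates of the points)
x = seq_along(y_curve)
#draw the segments: 6 points give 5 segments, so segment i takes the
#color of point i and the last color is unused
#(segments() has no type argument, so type="l" is dropped)
segments(head(x,-1), head(y_curve,-1), x[-1], y_curve[-1], col=head(colors,-1))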
This answer is based on the solution to this question:
How do I plot a graph in R, with the first values in one colour and the next values in another colour?
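For a longer series like the 3000 observations mentioned in the question, the same idea could look like the sketch below. The data are made up, and the rule that each segment takes the color of its starting point is an assumption rather than part of the original answer.

```r
# Hypothetical data standing in for the 3000-point curve
n <- 3000
y_curve <- cumsum(sin(seq_len(n) / 200))

# One color per point: first point blue, rest of the first half grey,
# second half red
colors <- c("#0000FF", rep("#606060", n / 2 - 1), rep("#FF0000", n / 2))

x <- seq_along(y_curve)

# n points give n - 1 segments; color each segment by its starting point
seg_col <- head(colors, -1)

plot(x, y_curve, type = "n")   # empty plot with the right axes
segments(head(x, -1), head(y_curve, -1), x[-1], y_curve[-1], col = seg_col)
```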
I am plotting a heatmap in R using the base R heatmap() function. Is there a way to define more colours so that the heatmap shows greater variation in the colours used? Currently it uses about 10, and the "hottest" area is quite large and dark purple. I want more colours so that this large area is itself broken down into more colours and is easier to differentiate.
Try experimenting with the color palettes of the grDevices package.
library(grDevices)
heatmap(x, col = topo.colors(n))
where n is the number of colors.
Or, alternatively
col = rainbow(n)
col = terrain.colors(n)
col = cm.colors(n)
However, the problem with differentiation often does not depend on the number of colors but on the variability of the data: many values may be clustered in a small range. In that case you could try to differentiate them by choosing a subrange or by transforming the data, for example by plotting their logarithm.
Examples:
50 colors from cm.colors palette:
heatmap(Ca, col=cm.colors(50), Rowv=NA, Colv=NA)
matrix of log values, with 50 colors from cm.colors palette:
heatmap(log(Ca), col=cm.colors(50), Rowv=NA, Colv=NA)
in which subtler differences can be seen.
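If the built-in palettes are not fine-grained enough, colorRampPalette() from grDevices can interpolate any number of colors between a few anchor colors. A small sketch follows; the matrix Ca here is simulated and the anchor colors are an arbitrary choice.

```r
# Simulated, skewed matrix standing in for the asker's data
set.seed(42)
Ca <- matrix(rexp(200, rate = 0.1), nrow = 20)

# colorRampPalette() returns a function; calling it with 100 gives 100 colors
pal <- colorRampPalette(c("white", "yellow", "orange", "red", "purple4"))(100)

heatmap(Ca, col = pal, Rowv = NA, Colv = NA)
```

Choosing anchors that differ strongly at the "hot" end is one way to split a large dark area into more distinguishable shades.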
I'd like to draw a parallel coordinate plot for the mtcars dataset and color the lines by one variable. I used this code:
library(GGally)
ggparcoord(data=mtcars, columns=1:10 , groupColumn=11)
It generated the graph, but all the lines are in shades of blue. I have trouble reading the graph and making observations because the colors are so similar. How can I use a different set of colors, such as blue, green and red, for the same variable?
Column 11 (carb) is numeric, so ggparcoord() colors it on a continuous blue gradient. Turn the grouping column into a factor and each level gets its own distinct color:
mtcars[,11] <- as.factor(mtcars[,11])
ggparcoord(data=mtcars, columns=1:10 , groupColumn=11)
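To pick the colors explicitly, note that ggparcoord() returns a ggplot object, so a manual color scale can be added to it. The sketch below works on a copy named cars (a hypothetical name) and assumes the GGally and ggplot2 packages are installed. The six color values are arbitrary, one per level of carb.

```r
# Work on a copy so the built-in mtcars is left untouched
cars <- mtcars
is.numeric(cars[, 11])               # carb is stored as a number
cars[, 11] <- as.factor(cars[, 11])
nlevels(cars[, 11])                  # number of distinct carb values

# Plotting part: skipped if GGally or ggplot2 is not installed
if (requireNamespace("GGally", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  p <- GGally::ggparcoord(data = cars, columns = 1:10, groupColumn = 11) +
    ggplot2::scale_colour_manual(
      values = c("blue", "green", "red", "orange", "purple", "black"))
  print(p)
}
```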
I have created a bean plot in R using the following
beanplot(windA, side='both', border='NA',
col=list('gray',c('red','white')),
ylab='Wind Speed (m/s)' ,what=c(1,1,1,0),xaxt ='n')
axis(1,at=1:12,labels=c('Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec'))
legend('topright', fill=c('gray','red'), legend= c('Measured', 'calc'))
and I get the following image
Is there a way that I can alternate the colors? For example, can I get Jan to be gray and red then Feb to be gray and blue and continue this alternating color scheme for the year?
You can specify the color order you want, e.g. col=list('gray','red','grey','blue'); the colors are recycled until all the beans are drawn. Here is an example using the built-in USJudgeRatings dataset:
require(beanplot)
beanplot(USJudgeRatings, side='both', border='NA',
col=list('gray','red','grey','blue'),
ylab='US Judge Ratings' ,what=c(1,1,1,0),xaxt ='n')
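For the twelve months in the question, the color list can also be built programmatically instead of relying on recycling. This is a sketch under assumptions: windA is simulated here with 24 columns (two per month), and the list is read as one entry per half-bean, as in the example above.

```r
months <- month.abb   # built-in month abbreviations "Jan" ... "Dec"

# One entry per half-bean: left half gray, right half alternating red / blue
bean_cols <- rep(list("gray", "red", "gray", "blue"),
                 length.out = 2 * length(months))

# Plotting part: skipped if the beanplot package is not installed
if (requireNamespace("beanplot", quietly = TRUE)) {
  set.seed(1)
  # Simulated wind speeds, two columns (measured, calculated) per month
  windA <- as.data.frame(matrix(rnorm(24 * 50, mean = 8, sd = 2), ncol = 24))
  beanplot::beanplot(windA, side = "both", border = NA, col = bean_cols,
                     what = c(1, 1, 1, 0), xaxt = "n",
                     ylab = "Wind Speed (m/s)")
  axis(1, at = 1:12, labels = months)
}
```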
I am new to R and I am trying to do some clustering on a data table where rows represent individual objects and columns represent the features that have been measured for these objects. I've worked through some clustering tutorials and I do get some output. However, the heatmap that I get after clustering does not correspond at all to the heatmap produced from the same data table with another programme. While the heatmap of that programme shows clear differences in marker expression between the objects, my heatmap doesn't show many differences and I cannot recognize any clustering (i.e., colour) pattern in it. It just looks like a randomly jumbled set of colours that are close to each other (no big contrast). Here is an example of the code I am using; maybe someone has an idea of what I might be doing wrong.
mydata <- read.table("mydata.csv")
datamat <- as.matrix(mydata)
datalog <- log(datamat)
I am using log values for the clustering because I know that the other programme does so, too.
library(gplots)
hr <- hclust(as.dist(1-cor(t(datalog), method="pearson")), method="complete")
mycl <- cutree(hr, k=7)
mycol <- sample(rainbow(256)); mycol <- mycol[as.vector(mycl)]
heatmap(datamat, Rowv=as.dendrogram(hr), Colv=NA,
col=colorpanel(40, "black","yellow","green"),
scale="column", RowSideColors=mycol)
Again, I plot the original colours but use the log-clusters because I know that this is what the other programme does.
I tried to play around with the methods, but I don't get anything that looks even remotely like a clustered heatmap. When I take out the scaling, the heatmap becomes extremely dark (and I am actually quite sure that I somehow have to scale or normalize the data by column). I also tried to cluster with k-means, but this didn't help either. My idea was that the colour scale might not be used completely because of two outliers, but although removing them slightly increased the range of colours plotted on the heatmap, this still did not reveal proper clusters.
Is there anything else I could play around with?
And is it possible to change the colour scale with heatmap so that outliers fall into a last bin that covers "everything greater than a particular value"? I tried to do this with heatmap.2 (argument "breaks"), but I didn't quite succeed, and I also didn't manage to add the row side colours that I use with the heatmap function.
If you are okay with using heatmap.2 from the gplots package, its breaks argument lets you assign colors to value ranges in your heatmap.
For example, if you had 3 colors (blue, white and red) with the values going from low to high, you could do something like this:
my.breaks <- c(seq(-5, -.6, length.out=6),seq(-.5999999, .1, length.out=4),seq(.100009,5, length.out=7))
result <- heatmap.2(mtscaled, Rowv=TRUE, scale='none', dendrogram="row", symm=TRUE, col=bluered(16), breaks=my.breaks)
In this case you have 3 sets of values that correspond to the 3 colors; the values will of course differ depending on your data. Note that the 17 breaks define exactly 16 bins, one for each of the 16 colors.
One thing your code does is call hclust on your data and then pass the result to heatmap. However, the heatmap manual page says of the hclustfun argument:
Defaults to hclust.
So I don't think you need to do that. You might want to take a look at some similar questions I asked, which might help point you in the right direction:
Heatmap Question 1
Heatmap Question 2
If you post an image of the heatmap you get and an image of the heatmap that the other program is making it will be easier for us to help you out more.
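For the "everything greater than a particular value" bin asked about in the question, one option is to build the breaks so that the last interval runs from a chosen cut-off to the maximum. The sketch below uses made-up data; cap, n_col and row_cols are hypothetical names, and the cut-off of 3 is an arbitrary assumption. heatmap.2 also accepts a RowSideColors argument, so the row side colours can be kept.

```r
set.seed(7)
x <- matrix(rnorm(100), nrow = 10)
x[1, 1] <- 25; x[2, 3] <- 40    # two hypothetical outliers

cap <- 3      # assumed cut-off: everything above it shares the last color
n_col <- 16   # number of colors, so n_col + 1 breaks are needed

# Evenly spaced breaks up to the cut-off, then one wide bin up to the maximum
breaks <- c(seq(min(x), cap, length.out = n_col), max(x))

# Check which bin each value falls into
bins <- cut(x, breaks = breaks, include.lowest = TRUE, labels = FALSE)
table(bins)

# Plotting part: skipped if gplots is not installed
if (requireNamespace("gplots", quietly = TRUE)) {
  row_cols <- rainbow(3)[cutree(hclust(dist(x)), k = 3)]
  gplots::heatmap.2(x, scale = "none", trace = "none", breaks = breaks,
                    col = gplots::bluered(n_col), RowSideColors = row_cols)
}
```

With scale="none" the breaks apply to the values as passed in, so any column scaling or log transform would have to be done on the matrix beforehand.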