HeatMap not displaying correctly using ggplot() - r

I am running into a strange problem when trying to plot a heat map of a dataset that I have.
I am using the following code to plot the heat map:
xaxis<-c('density')
midrange<-range(red[,xaxis])
xaxis <- c(xaxis,'quality')
molten<-melt(red[,xaxis],'quality')
p <- ggplot(molten, aes(x = value, y = quality))
p <- p + geom_tile(aes(fill = value), colour = "white")
p <- p + theme_minimal()
# turn y-axis text 90 degrees (optional, saves space)
p <- p + theme(axis.text.y = element_text(angle = 90, hjust = 0.5))
# remove axis titles, tick marks, and grid
p <- p + theme(axis.title = element_blank())
p <- p + theme(axis.ticks = element_blank())
p <- p + theme(panel.grid = element_blank())
p <- p + scale_y_discrete(expand = c(0, 0))
# optionally remove row labels (not useful depending on molten)
p <- p + theme(axis.text.x = element_blank())
# get diverging color scale from colorbrewer
# #008837 is green, #7b3294 is purple
palette <- c("#008837", "#b7f7f4", "#b7f7f4", "#7b3294")
if(midrange[1] == midrange[2]) {
# use a 3 color gradient instead
p <- p + scale_fill_gradient2(low = palette[1], mid = palette[2], high = palette[4], midpoint = midrange[1]) +
xlim(midrange[1],midrange[2])
}else{
# use a 4 color gradient (with a swath of white in the middle)
p <- p + scale_fill_gradientn(colours = palette, values = c(0, midrange[1], midrange[2], 1)) +
xlim(midrange[1],midrange[2])
}
p
I am trying to plot the heat map of the variable density and would like to use the variable quality to separate the rows of the heat map. When I use the above code, I get the following plot:
The image is clearly blank. This happens because the range of the variable density is very narrow. It does not happen if I switch to a variable with a wider range (pH, for example).
Should ggplot adjust to this automatically? If not, how can I get ggplot to show the real plot?
Any help in this regard will be much appreciated.

There are (at least) two problems here.
First, you have almost 1600 tiles in the x-direction, so specifying colour = "white" for the outline means that all you see is the outline, hence white. Try taking this out.
Second, the values = c(...) argument to scale_fill_gradientn(...) expects positions between 0 and 1, but you pass midrange[1] and midrange[2] directly, and midrange[2] = 1.003.
After taking colour = "white" out of the call to geom_tile(...), I get a plot that is no longer blank.
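For the second point, here is a minimal sketch of converting breakpoints given in data units into the 0 to 1 positions that values expects, using scales::rescale(). The data frame is a made-up stand-in for the wine data, and the mid_band quantiles are an arbitrary assumption for illustration, not something taken from the question.

```r
library(ggplot2)
library(scales)

# Hypothetical stand-in for the real data: a narrow-range variable plus a grouping
set.seed(1)
red <- data.frame(
  density = round(runif(200, 0.990, 1.003), 4),
  quality = sample(3:8, 200, replace = TRUE)
)

rng <- range(red$density)
# Assumed band (in data units) that should receive the neutral middle colour
mid_band <- unname(quantile(red$density, c(0.4, 0.6)))
# Map the data-unit breakpoints onto the 0..1 scale expected by `values`
vals <- rescale(c(rng[1], mid_band, rng[2]), from = rng)

pal <- c("#008837", "#b7f7f4", "#b7f7f4", "#7b3294")
p <- ggplot(red, aes(x = density, y = factor(quality), fill = density)) +
  geom_tile() +  # no white outline, so the fill stays visible
  scale_fill_gradientn(colours = pal, values = vals)
```

Printing p draws the plot. The point of the sketch is only that vals always starts at 0 and ends at 1, whatever the range of the data.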

Related

Average line for 2D Histogram?

I am a very new user of R and have a question. I am currently making 2D histograms in R. The subject matter of the data does not really matter; what I want to know is how to plot an average line on the 2D histogram. The code I am running is this:
load("mydatabin.RData")
# Color housekeeping
library(RColorBrewer)
rf <- colorRampPalette(rev(brewer.pal(11,'Spectral')))
r <- rf(32)
# Pull the two variables to plot out of the loaded data
x <- mydata$AGE
y <- mydata$BP
df <- data.frame(x,y)
# Plot
plot(df, pch=16, col='black', cex=0.5)
This gives me a scatter plot and then to turn it into a 2D Histogram I do:
library(ggplot2)
# Default call (as object)
p <- ggplot(df, aes(x,y))
h3 <- p + stat_bin2d()
h3
# Default call (using qplot)
qplot(x,y,data=df, geom='bin2d')
After this I do:
h3 <- p + stat_bin2d(bins=25) + scale_fill_gradientn(colours=r)
h3
to add color.
From here, how do I plot an average line of the data?
Also, can anyone tell me how to plot a heat map that looks like this using mydatabin.RData:
Thanks.
You can use geom_hline or geom_vline in ggplot2, passing yintercept or xintercept to draw a line. In your case, the intercept can be the mean of one of your columns, which gives an average line. See the code below for an example.
I also tried two different ways of drawing 2D histograms. Yours seems better and more precise, though I removed RColorBrewer.
library(ggplot2)
# Create normally distributed data for plotting
x <- rnorm(10000)
y <- rnorm(10000)
df <- data.frame(x,y)
# stat_density2d way, with average lines
p1 <- ggplot(df,aes(x=x,y=y))+
stat_density2d(aes(fill=..level..), geom="polygon") +
scale_fill_gradient(low="navy", high="yellow") +
# Here go average lines
geom_hline(yintercept = mean(df$y), color = "red") +
geom_vline(xintercept = mean(df$x), color = "red") +
# Just to remove grid and set background color
theme(line = element_blank(),
panel.background = element_rect(fill = "navy"))
p1
# stat_bin2d way, with average lines
p2 <- ggplot(df, aes(x,y)) +
stat_bin2d(bins=50) +
scale_fill_gradient(low="navy", high="yellow") +
# Here go average lines
geom_hline(yintercept = mean(df$y), color = "red") +
geom_vline(xintercept = mean(df$x), color = "red") +
# Just to remove grid and set background color
theme(line = element_blank(),
panel.background = element_rect(fill = "navy"))
p2
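If "average line" is meant as the mean of y within each x-bin rather than one overall mean, a possible sketch uses stat_summary_bin(). This reads the question differently from the answer above, the simulated data is invented for illustration, and the fun argument assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

# Simulated data with a trend, so the binned mean is not a flat line
set.seed(42)
df <- data.frame(x = rnorm(10000))
df$y <- 0.5 * df$x + rnorm(10000)

p3 <- ggplot(df, aes(x, y)) +
  stat_bin2d(bins = 25) +
  scale_fill_gradient(low = "navy", high = "yellow") +
  # Mean of y within each of 25 x-bins, joined up as a line
  stat_summary_bin(fun = mean, geom = "line", bins = 25, colour = "red")
```

Using the same number of bins in both layers keeps the line aligned with the histogram columns.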

plotting labels outside of plot in ggplot

I have a plot made from the following code:
variable=c("A","B","C","D","E")
value=c(1,2,3,4,5);
type=c("A","B","A","A","B")
temp<-data.frame(var=factor(variable),val=value,type=factor(type))
p<-ggplot(temp,aes(var,val,color=type))+geom_point()
p<-p+coord_flip()+theme(plot.margin = unit(c(1,5,1,1), "lines"),legend.position = "none")
How can I place labels for the values (now on the x-axis) on the right-hand side of the plot at the correct level? That is, I want it to say "5 4 3 2 1" vertically on the right side, each at the height of the corresponding variable.
Thanks
If you make "variable" the y-axis label rather than the actual values of the plot, you can use sec_axis with a 1:1 transformation:
temp <- data.frame(val = value, var = value, type = type)
p <- ggplot(temp,aes(var,val,color=type)) +
geom_point() +
theme(plot.margin = unit(c(1,5,1,1), "lines"), legend.position = "none")
p <- p + scale_y_continuous(labels = variable, sec.axis = sec_axis(~.*1))
p

Cannot remove grey area behind legend symbol when using smooth

I'm using ggplot2 with a GAM smooth to look at the relationship between two variables. When plotting I'd like to remove the grey area behind the symbol for the two types of variables. For that I would use theme(legend.key = element_blank()), but that doesn't seem to work when using a smooth.
Can anyone tell me how to remove the grey area behind the two black lines in the legend?
I have a MWE below.
library(ggplot2)
len <- 10000
x <- seq(0, len-1)
df <- as.data.frame(x)
df$y <- 1 - df$x*(1/len)
df$y <- df$y + rnorm(len,sd=0.1)
df$type <- 'method 1'
df$type[df$y>0.5] <- 'method 2'
p <- ggplot(df, aes(x=x, y=y)) + stat_smooth(aes(lty=type), col="black", method = "auto", size=1, se=TRUE)
p <- p + theme_classic()
p <- p + theme(legend.title=element_blank())
p <- p + theme(legend.key = element_blank()) # <--- this doesn't work?
p
Here is a very hacky workaround, based on the notion that anything you map to an aesthetic in ggplot appears in the legend. geom_smooth has a fill aesthetic, which allows different groups to be filled differently if desired. When something is hard to fix downstream, it is sometimes easier to keep the unwanted item out of the legend altogether. In your case, the colour of the standard-error band appeared in the legend. So I created two smooths: one without a line colour (but grouped by type) to draw the standard-error bands, and one with linetype mapped in aes but se set to FALSE.
p <- ggplot(df, aes(x=x, y=y)) +
#first smooth; se only
stat_smooth(aes(group=type),col=NA, method = "auto", size=1, se=TRUE)+
#second smooth: line only
stat_smooth(aes(lty=type),col="black", method = "auto", size=1, se=F) +
theme_classic() +
theme(
legend.title = element_blank(),
legend.key = element_rect(fill = NA, color = NA)) # thank you @alko989

ggplot geom_text font size control

I tried to change the font to 10 for the labels of my bar plot in ggplot2 by doing something like this:
ggplot(data=file,aes(x=V1,y=V3,fill=V2)) +
geom_bar(stat="identity",position="dodge",colour="white") +
geom_text(aes(label=V2),position=position_dodge(width=0.9),
hjust=1.5,colour="white") +
theme_bw()+theme(text=element_text(size=10))
ggsave(filename="barplot.pdf",width=4,height=4)
but the resulting image has super big font size for the bar plot labels.
Then I thought of modifying in geom_text() with this:
geom_text(size=10,aes(label=V2),position=position_dodge(width=0.9),
hjust=1.5,colour="white")
The label font is even bigger...
If I set the size within geom_text to something like 3, it looks like font size 10, similar to the axis labels.
I'm wondering what's going on. Does theme(text=element_text(size=10)) not apply to the labels?
And why is a size of 10 in geom_text() different from that in theme(text=element_text())?
Here are a few options for changing text / label sizes
library(ggplot2)
# Example data using mtcars
a <- aggregate(mpg ~ vs + am , mtcars, function(i) round(mean(i)))
p <- ggplot(mtcars, aes(factor(vs), y=mpg, fill=factor(am))) +
geom_bar(stat="identity",position="dodge") +
geom_text(data = a, aes(label = mpg),
position = position_dodge(width=0.9), size=20)
The size in the geom_text changes the size of the geom_text labels.
p <- p + theme(axis.text = element_text(size = 15)) # changes axis labels
p <- p + theme(axis.title = element_text(size = 25)) # change axis titles
p <- p + theme(text = element_text(size = 10)) # this will change all text size
# (except geom_text)
Regarding "And why is a size of 10 in geom_text() different from that in theme(text=element_text())?":
Yes, they are different. I did a quick manual check, and a theme size needs to be about 14/5 times the geom_text size to look the same.
So a horrible fix for uniform sizes is to scale by this ratio:
geom.text.size = 7
theme.size = (14/5) * geom.text.size
ggplot(mtcars, aes(factor(vs), y=mpg, fill=factor(am))) +
geom_bar(stat="identity",position="dodge") +
geom_text(data = a, aes(label = mpg),
position = position_dodge(width=0.9), size=geom.text.size) +
theme(axis.text = element_text(size = theme.size, colour="black"))
This of course doesn't explain why, and it is a pain (I assume there is a more sensible way to do this).
Take a look at the relevant entry in ggplot2's customization FAQ: https://ggplot2.tidyverse.org/articles/faq-customising.html#what-is-the-default-size-of-geom_text-and-how-can-i-change-the-font-size-of-geom_text
You can modify the default size of geom_text() by placing update_geom_defaults("text", list(size = X)), where X is your choice of new size, at the beginning of your script.
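The 14/5 ratio found by hand is close to ggplot2's exported constant .pt (72.27 / 25.4, about 2.85): geom_text() sizes are in millimetres, while theme sizes are in points. A minimal sketch of converting between the two, where the variable names theme_pt and label_mm are made up for illustration:

```r
library(ggplot2)

theme_pt <- 10              # desired font size in points
label_mm <- theme_pt / .pt  # the same size expressed in mm for geom_text()

a <- aggregate(mpg ~ vs + am, mtcars, function(i) round(mean(i)))

p <- ggplot(a, aes(factor(vs), mpg, fill = factor(am))) +
  geom_col(position = "dodge") +
  geom_text(aes(label = mpg),
            position = position_dodge(width = 0.9),
            vjust = -0.3, size = label_mm) +
  # Base theme text at 10 pt; axis tick labels default to 80% of this
  theme(text = element_text(size = theme_pt))
```

With this conversion, a requested 10 pt comes out as a geom_text() size of roughly 3.5, which agrees with the observation in the question that a size of about 3 looks like font 10.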

overlaying plots in ggplot2

How to overlay one plot on top of the other in ggplot2 as explained in the following sentences? I want to draw the grey time series on top of the red one using ggplot2 in R (now the red one is above the grey one and I want my graph to be the other way around). Here is my code (I generate some data in order to show you my problem, the real dataset is much more complex):
install.packages("ggplot2")
library(ggplot2)
time <- rep(1:100,2)
timeseries <- c(rep(0.5,100),rep(c(0,1),50))
upper <- c(rep(0.7,100),rep(0,100))
lower <- c(rep(0.3,100),rep(0,100))
legend <- c(rep("red should be under",100),rep("grey should be above",100))
dataset <- data.frame(timeseries,upper,lower,time,legend)
ggplot(dataset, aes(x=time, y=timeseries)) +
geom_line(aes(colour=legend, size=legend)) +
geom_ribbon(aes(ymax=upper, ymin=lower, fill=legend), alpha = 0.2) +
scale_colour_manual(limits=c("grey should be above","red should be under"),values = c("grey50","red")) +
scale_fill_manual(values = c(NA, "red")) +
scale_size_manual(values=c(0.5, 1.5)) +
theme(legend.position="top", legend.direction="horizontal",legend.title = element_blank())
Convert the variable you are grouping on into a factor and explicitly set the order of its levels. Within a layer, ggplot draws the groups in level order, so the last level ends up on top. It is also a good idea to place each scale_*_manual() call next to the geom it applies to, for readability.
legend <- factor(legend, levels = c("red should be under","grey should be above"))
dataset <- data.frame(timeseries,upper,lower,time,legend)
ggplot(dataset, aes(x=time, y=timeseries)) +
geom_ribbon(aes(ymax=upper, ymin=lower, fill=legend), alpha = 0.2) +
scale_fill_manual(values = c("red", NA)) +
geom_line(aes(colour=legend, size=legend)) +
scale_colour_manual(values = c("red","grey50")) +
scale_size_manual(values=c(1.5,0.5)) +
theme(legend.position="top", legend.direction="horizontal",legend.title = element_blank())
Note that the values in each scale_*_manual() call are now ordered to match the factor levels: "red" first, then "grey".
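An alternative sketch, not from the answer above, relies on ggplot2 drawing layers in the order they are added: put each series in its own layer and add the one that should be on top last. The names red_part and grey_part are made up, the ribbon is left out for brevity, and linewidth assumes ggplot2 3.4.0 or later (older versions use size).

```r
library(ggplot2)

time <- rep(1:100, 2)
timeseries <- c(rep(0.5, 100), rep(c(0, 1), 50))
legend <- c(rep("red should be under", 100), rep("grey should be above", 100))
dataset <- data.frame(time, timeseries, legend)

# Split the data so each series gets its own layer
red_part  <- subset(dataset, legend == "red should be under")
grey_part <- subset(dataset, legend == "grey should be above")

p <- ggplot(mapping = aes(x = time, y = timeseries, colour = legend)) +
  geom_line(data = red_part, linewidth = 1.5) +   # added first, drawn underneath
  geom_line(data = grey_part, linewidth = 0.5) +  # added last, drawn on top
  scale_colour_manual(values = c("red should be under" = "red",
                                 "grey should be above" = "grey50")) +
  theme(legend.position = "top", legend.title = element_blank())
```

Naming the entries in values ties each colour to its series explicitly, so the result does not depend on level order.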
