How can I overlay one plot on top of another in ggplot2? I want to draw the grey time series on top of the red one using ggplot2 in R (at the moment the red one is drawn above the grey one, and I want it the other way around). Here is my code (I generate some data to illustrate the problem; the real dataset is much more complex):
install.packages("ggplot2")
library(ggplot2)
time <- rep(1:100,2)
timeseries <- c(rep(0.5,100),rep(c(0,1),50))
upper <- c(rep(0.7,100),rep(0,100))
lower <- c(rep(0.3,100),rep(0,100))
legend <- c(rep("red should be under",100),rep("grey should be above",100))
dataset <- data.frame(timeseries,upper,lower,time,legend)
ggplot(dataset, aes(x=time, y=timeseries)) +
geom_line(aes(colour=legend, size=legend)) +
geom_ribbon(aes(ymax=upper, ymin=lower, fill=legend), alpha = 0.2) +
scale_colour_manual(limits=c("grey should be above","red should be under"),values = c("grey50","red")) +
scale_fill_manual(values = c(NA, "red")) +
scale_size_manual(values=c(0.5, 1.5)) +
theme(legend.position="top", legend.direction="horizontal",legend.title = element_blank())
Convert the variable you are grouping on into a factor and explicitly set the order of its levels. ggplot2 draws the groups within a layer in this order, so the last level ends up on top. Also, it is a good idea to place each scale_*_manual() call next to the geom it applies to, for readability.
legend <- factor(legend, levels = c("red should be under","grey should be above"))
dataset_ordered <- data.frame(timeseries,upper,lower,time,legend) # avoid naming a data frame "c", which shadows base::c
ggplot(dataset_ordered, aes(x=time, y=timeseries)) +
geom_ribbon(aes(ymax=upper, ymin=lower, fill=legend), alpha = 0.2) +
scale_fill_manual(values = c("red", NA)) +
geom_line(aes(colour=legend, size=legend)) +
scale_colour_manual(values = c("red","grey50")) +
scale_size_manual(values=c(1.5,0.5)) +
theme(legend.position="top", legend.direction="horizontal",legend.title = element_blank())
Note that the values in each scale_*_manual() call are now matched to the factor levels in their new order: "red should be under" first, then "grey should be above".
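To see why the order flips, it can help to look at the factor levels directly. The snippet below is a minimal base R sketch (the shortened `legend` vector and the `line_colours` name are only for illustration, not part of the original code). It shows that the default level order is alphabetical, and that a named vector is one way to make the colour-to-level pairing explicit instead of relying on position.

```r
# Shortened stand-in for the legend column used above
legend <- c(rep("red should be under", 3), rep("grey should be above", 3))

# Default: levels are sorted alphabetically, so "grey ..." comes first
default_levels <- levels(factor(legend))

# Explicit order: "red ..." first, "grey ..." last
legend_ordered <- factor(legend,
                         levels = c("red should be under", "grey should be above"))
explicit_levels <- levels(legend_ordered)

# A named vector pairs each level with its colour by name, not by position;
# it could be passed as scale_colour_manual(values = line_colours)
line_colours <- c("red should be under" = "red",
                  "grey should be above" = "grey50")

print(default_levels)
print(explicit_levels)
```

With named values, reordering the levels later does not silently swap the colours.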
I would like to plot densities of two variables ("red_variable", "green_variable") from two independent dataframes on one density plot, using red and green color for the two variables.
This is my attempt at coding:
library(ggplot2)
### Create dataframes
red_dataframe <- data.frame(red_variable = c(10,11,12,13,14))
green_dataframe <- data.frame(green_variable = c(6,7,8,9,10))
mean(red_dataframe$red_variable) # mean is 12
mean(green_dataframe$green_variable) # mean is 8
### Set colors
red_color= "#FF0000"
green_color= "#008000"
### Trying to plot densities with correct colors and correct legend entries
ggplot() +
geom_density(aes(x=red_variable, fill = red_color, alpha=0.5), data=red_dataframe) +
geom_density(aes(x=green_variable, fill = green_color, alpha=0.5), data=green_dataframe) +
scale_fill_manual(labels = c("Density of red_variable", "Density of green_variable"), values = c(red_color, green_color)) +
xlab("X value") +
ylab("Density") +
labs(fill = "Legend") +
guides(alpha=FALSE)
Result: The legend shows correct colors, but the colors on the plot are wrong: The "red" variable is plotted with green color, the "green" variable with red color. The "green" density (mean=8) should appear left and the "red" density (mean=12) on the right on the x-axis. This behavior of the plot doesn't make any sense to me.
I can in fact get the desired result by switching red and green in the code:
### load ggplot2
library(ggplot2)
### Create dataframes
red_dataframe <- data.frame(red_variable = c(10,11,12,13,14))
green_dataframe <- data.frame(green_variable = c(6,7,8,9,10))
mean(red_dataframe$red_variable) # mean is 12
mean(green_dataframe$green_variable) # mean is 8
### Set colors
red_color= "#FF0000"
green_color= "#008000"
### Trying to plot densities with correct colors and correct legend entries
ggplot() +
geom_density(aes(x=red_variable, fill = green_color, alpha=0.5), data=red_dataframe) +
geom_density(aes(x=green_variable, fill = red_color, alpha=0.5), data=green_dataframe) +
scale_fill_manual(labels = c("Density of red_variable", "Density of green_variable"), values = c(red_color, green_color)) +
xlab("X value") +
ylab("Density") +
labs(fill = "Legend") +
guides(alpha=FALSE)
... While the plot makes sense now, the code doesn't. I cannot really trust code doing the opposite of what I would expect it to do. What's the problem here? Am I color blind?
In your code, to get each color in the right place, you need to specify fill = red_color or fill = green_color (and alpha, since it is a constant, as pointed out by @Gregor) outside of aes(), like this:
...+
geom_density(aes(x=red_variable), alpha=0.5, fill = red_color, data=red_dataframe) +
geom_density(aes(x=green_variable), alpha=0.5, fill = green_color, data=green_dataframe) + ...
Alternatively, you can bind your dataframes together, reshape them into a longer format (much more appropriate to ggplot) and then add color column that you can use with scale_fill_identity function (https://ggplot2.tidyverse.org/reference/scale_identity.html):
df <- cbind(red_dataframe,green_dataframe)
library(tidyr)
library(ggplot2)
library(dplyr)
df <- df %>% pivot_longer(.,cols = c(red_variable,green_variable), names_to = "var",values_to = "val") %>%
mutate(Color = ifelse(grepl("red",var),red_color,green_color))
ggplot(df, aes(val, fill = Color))+
geom_density(alpha = 0.5)+
scale_fill_identity(guide = "legend", name = "Legend", labels = levels(as.factor(df$var)))+
xlab("X value") +
ylab("Density")
Does this answer your question?
You're trying to use ggplot as if it's base graphics... the mindset shift can take a little while to get used to. dc37's answer shows how you should do it. I'll try to explain what goes wrong in your attempt:
When you put fill = green_color inside aes(), ggplot essentially creates a new column of data filled with the green_color value in your green_dataframe, i.e., "#008000", "#008000", "#008000", .... Ditto for the red color value in the red data frame. We can see this if we modify your plot by simply deleting your scale:
ggplot() +
geom_density(aes(x = red_variable, fill = green_color, alpha = 0.5), data = red_dataframe) +
geom_density(aes(x = green_variable, fill = red_color, alpha = 0.5), data = green_dataframe) +
xlab("X value") +
ylab("Density") +
labs(fill = "Legend") +
guides(alpha = FALSE)
We can actually get what you want by adding the identity scale, which is designed for the (common in base, rare in ggplot2) case where you actually put color values in the data.
ggplot() +
geom_density(aes(x = red_variable, fill = green_color, alpha = 0.5), data = red_dataframe) +
geom_density(aes(x = green_variable, fill = red_color, alpha = 0.5), data = green_dataframe) +
scale_fill_identity() +
xlab("X value") +
ylab("Density") +
labs(fill = "Legend") +
guides(alpha = FALSE)
When you added your scale_fill_manual, ggplot was like "okay, cool, you want to specify colors and labels". But you were thinking in the order that you added the layers to the plot (much like base graphics), whereas ggplot was thinking of these newly created variables "#FF0000" and "#008000", which it ordered alphabetically by default (just as if they were factor or character columns in a data frame). And since you happened to add the layers in reverse alphabetical order, it was switched.
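The swap can be reproduced without drawing anything. This is a rough base R sketch of the matching logic described above, not ggplot2's actual internals; the names `fill_levels`, `manual_values` and `displayed_colour` are made up for the illustration.

```r
red_color <- "#FF0000"
green_color <- "#008000"

# The two constants placed inside aes() become the values of the fill variable.
# A discrete scale orders them like any character column: alphabetically.
fill_levels <- sort(unique(c(red_color, green_color)))

# The values given to scale_fill_manual() are assigned to those levels by position
manual_values <- c(red_color, green_color)
displayed_colour <- setNames(manual_values, fill_levels)

# names = the value in the data, elements = the colour actually drawn
print(displayed_colour)
```

Since "#008000" sorts before "#FF0000", the layer whose data value is "#008000" is drawn with the first manual value, which is red.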
dc37's answer shows a couple better methods. With ggplot you should (a) work with a single, long-format data frame whenever possible (b) don't put constants inside aes() (constant color, constant alpha, etc.), (c) set colors in a scale_fill_* or scale_color_* function when they're not constant.
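As a minimal sketch of points (a) to (c), assuming the same two data frames as in the question: stack them into one long data frame, map fill to the grouping column, keep the constant alpha outside aes(), and pair colours with groups by name. The column names `var` and `val` and the object names `long_df` and `density_plot` are illustrative choices, not taken from the question.

```r
library(ggplot2)

red_dataframe <- data.frame(red_variable = c(10, 11, 12, 13, 14))
green_dataframe <- data.frame(green_variable = c(6, 7, 8, 9, 10))

# (a) one long data frame; the grouping column doubles as the legend label
long_df <- rbind(
  data.frame(var = "Density of red_variable", val = red_dataframe$red_variable),
  data.frame(var = "Density of green_variable", val = green_dataframe$green_variable)
)

density_plot <- ggplot(long_df, aes(x = val, fill = var)) +
  # (b) constant alpha goes outside aes()
  geom_density(alpha = 0.5) +
  # (c) colours are set in the scale; names tie each colour to its group
  scale_fill_manual(
    name = "Legend",
    values = c("Density of red_variable" = "#FF0000",
               "Density of green_variable" = "#008000")
  ) +
  xlab("X value") +
  ylab("Density")

print(density_plot)
```

Because the values are named, the colours stay attached to the right group regardless of how the groups are sorted.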
Hi, I would like to plot the following data frame.
d<- data.frame (pid=c("d","b","c"), type=c("rna","rna","rna"), value = c(1,2,3) )
d2 <- data.frame (pid=c("d","b","c"), type=c("dna","dna","dna"), value = c(10,20,30) )
df <- rbind (d,d2)
ggplot(df, aes(y=pid, x=type)) +
geom_tile(aes(fill = value), colour = "white") +
scale_fill_gradient(low = "white", high = "steelblue")
This produces a plot that looks like this,
However, I would like each x factor to have its own color gradient, so ideally rna is blue to white while dna is red to white. Is there any way to do this? Or, if a different gradient is not possible, what about just different scales? Thanks!
Here is what it looks like applying @Brian's suggestion to your original example. You may want to rescale the rna and dna values separately, to make the color ranges more comparable.
p = ggplot(df, aes(y=pid, x=type, fill=type, alpha=value)) +
geom_tile(colour="white", size=1) +
scale_fill_manual(values=c(dna="salmon", rna="steelblue")) +
theme_bw() +
theme(panel.grid=element_blank()) +
coord_cartesian(expand=FALSE)
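One possible way to do the per-type rescaling mentioned above is a min-max rescale within each type using base R's ave(). This is only a sketch: the helper `rescale01` and the column `value_scaled` are names introduced here, and min-max scaling is an assumption about what "comparable" should mean for your data.

```r
d  <- data.frame(pid = c("d", "b", "c"), type = "rna", value = c(1, 2, 3))
d2 <- data.frame(pid = c("d", "b", "c"), type = "dna", value = c(10, 20, 30))
df <- rbind(d, d2)

# Min-max rescale to [0, 1]; assumes each group has at least two distinct values
# (otherwise max(v) - min(v) is zero and the result is NaN)
rescale01 <- function(v) (v - min(v)) / (max(v) - min(v))

# ave() applies the function within each type and keeps the original row order
df$value_scaled <- ave(df$value, df$type, FUN = rescale01)

print(df)
```

Mapping alpha = value_scaled instead of alpha = value in the plot above would then give both columns the same transparency range.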
ggplot(iris, aes(Sepal.Length,
Petal.Width,
color = Species,
alpha = Sepal.Width)) +
geom_point(size = 4)
Also see: How to create a continuous legend (color bar style) for scale_alpha?
I'm using ggplot2 with a GAM smooth to look at the relationship between two variables. When plotting I'd like to remove the grey area behind the symbol for the two types of variables. For that I would use theme(legend.key = element_blank()), but that doesn't seem to work when using a smooth.
Can anyone tell me how to remove the grey area behind the two black lines in the legend?
I have a MWE below.
library(ggplot2)
len <- 10000
x <- seq(0, len-1)
df <- as.data.frame(x)
df$y <- 1 - df$x*(1/len)
df$y <- df$y + rnorm(len,sd=0.1)
df$type <- 'method 1'
df$type[df$y>0.5] <- 'method 2'
p <- ggplot(df, aes(x=x, y=y)) + stat_smooth(aes(lty=type), col="black", method = "auto", size=1, se=TRUE)
p <- p + theme_classic()
p <- p + theme(legend.title=element_blank())
p <- p + theme(legend.key = element_blank()) # <--- this doesn't work?
p
Here is a very hacky workaround, based on the notion that if you map things to aesthetics in ggplot, they appear in the legend. geom_smooth has a fill aesthetic which allows for different colourings of different groups if one so desires. If it's hard to fix that downstream, sometimes it's easier to keep those unwanted items out of the legend altogether. In your case, the color of the se appeared in the legend. As such, I've created two geom_smooths: one without a line color (but grouped by type) to create the plotted se's, and one with linetype mapped in aes but se set to FALSE.
p <- ggplot(df, aes(x=x, y=y)) +
#first smooth; se only
stat_smooth(aes(group=type),col=NA, method = "auto", size=1, se=TRUE)+
#second smooth: line only
stat_smooth(aes(lty=type),col="black", method = "auto", size=1, se=FALSE) +
theme_classic() +
theme(
legend.title = element_blank(),
legend.key = element_rect(fill = NA, color = NA)) # thank you @alko989
Consider the following two plots
library(ggplot2)
set.seed(666)
bigx <- data.frame(x=sample(1:12,50,replace=TRUE))
ggplot(bigx, aes(x=x)) +
geom_histogram(fill = "red", colour = "black",stat="bin",binwidth=2) +
ylab("Frequency") +
xlab("things") +
ylim(c(0,30))
hist(bigx$x)
Why do I get the overhang above 12 in ggplot? When I play with right = TRUE, this just shifts the overhang to below zero. I want the simple, cleanly bounded result from hist(), but using ggplot2.
How can I do this?
If your goal is to reproduce the output of hist(...) using ggplot, this will work:
ggplot(bigx, aes(x=x)) +
geom_histogram(fill = "red", colour = "black",stat="bin",
binwidth=2, right=TRUE) +
scale_x_continuous(limits=c(0,12),breaks=seq(0,12,2))
Or, more generally, this:
brks <- hist(bigx$x, plot=FALSE)$breaks
ggplot(bigx, aes(x=x)) +
geom_histogram(fill = "red", colour = "black",stat="bin",
breaks=brks, right=TRUE) +
scale_x_continuous(limits=range(brks),breaks=brks)
Evidently, hist(...) uses right-closed intervals by default (right = TRUE), whereas the ggplot version this answer was written against uses left-closed intervals by default, which is why right=TRUE has to be set explicitly above. Also, ggplot uses a different algorithm for calculating the x-axis breaks and limits.
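The effect of the interval convention can be checked in base R alone. The sketch below regenerates the same sample and compares hist()'s counts with counts from cut() under both conventions; the variable names are illustrative.

```r
set.seed(666)
x <- sample(1:12, 50, replace = TRUE)

# Breaks and counts as computed by hist() with its defaults (right = TRUE)
h <- hist(x, plot = FALSE)
brks <- h$breaks

# Right-closed bins (a, b], with the lowest break included: matches hist()
right_closed <- as.vector(table(cut(x, brks, right = TRUE, include.lowest = TRUE)))

# Left-closed bins [a, b), with the highest break included: generally differs
left_closed <- as.vector(table(cut(x, brks, right = FALSE, include.lowest = TRUE)))

print(rbind(hist = h$counts, right_closed = right_closed, left_closed = left_closed))
```

As far as I know, current ggplot2 releases control this in stat_bin() through closed = "right" or closed = "left" rather than a right argument, so the calls above may need adjusting depending on your installed version.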
I am using ggplot's geom_tile to do 2-D density plots faceted by a factor. Every facet's scale goes from the minimum of all the data to the maximum of all the data, but the geom_tile in each facet only extends to the range of the data plotted in that facet.
Example code that demonstrates the problem:
library(ggplot2)
data.unlimited <- data.frame(x=rnorm(500), y=rnorm(500))
data.limited <- subset(data.frame(x=rnorm(500), y=rnorm(500)), x<1 & y<1 & x>-1 & y>-1)
mydata <- rbind(data.frame(groupvar="unlimited", data.unlimited),
data.frame(groupvar="limited", data.limited))
ggplot(mydata) +
aes(x=x,y=y) +
stat_density2d(geom="tile", aes(fill = ..density..), contour = FALSE) +
facet_wrap(~ groupvar)
Run the code, and you will see two facets. One facet shows a density plot of an "unlimited" random normal distribution. The second facet shows a random normal truncated to lie within a 2x2 square about the origin. The geom_tile in the "limited" facet will be confined inside this small box instead of filling the facet.
last_plot() +
scale_x_continuous(limits=c(-5,5)) +
scale_y_continuous(limits=c(-5,5))
These last three lines plot the same data with specified x and y limits, and we see that neither facet extends the tile sections to the edge in this case.
Is there any way to force the geom_tile in each facet to extend to the full range of the facet?
I think you're looking for a combination of scales = "free" and expand = c(0,0):
ggplot(mydata) +
aes(x=x,y=y) +
stat_density2d(geom="tile", aes(fill = ..density..), contour = FALSE) +
facet_wrap(~ groupvar,scales = "free") +
scale_x_continuous(expand = c(0,0)) +
scale_y_continuous(expand = c(0,0))
EDIT
Given the OP's clarification, here's one option via simply setting the panel background manually:
ggplot(mydata) +
aes(x=x,y=y) +
stat_density2d(geom="tile", aes(fill = ..density..), contour = FALSE) +
facet_wrap(~ groupvar) +
scale_fill_gradient(low = "blue", high = "red") +
theme(panel.background = element_rect(fill = "blue"),panel.grid.major = element_blank(),
panel.grid.minor = element_blank()) # opts()/theme_rect()/theme_blank() were replaced by theme()/element_rect()/element_blank()