I'm using ggplot to create a heat-map style plot, and would like to add a second legend with the data scaled a different way. I'm wondering if there is a simple way to do this.
I do not believe that this is a duplicate of other "multiple legends" questions (e.g. "Multiple legends for a ggplot in R"), because crucially I want to add extra legends for the same aesthetic, i.e. one aesthetic mapping, two legends.
Example code
# Create a dataframe with some dummy data
x <- c()
y <- c()
for(i in 1:100){
for(j in 1:100){
x <- c(x, i)
y <- c(y, j)
}
}
example_data <- data.frame(x, y)
example_data$z <- example_data$x*example_data$y
example_data$z_rescale <- example_data$z*0.5
Now we've got some data that I'd like to plot as a heatmap with "z" as a colour gradient.
ggplot(example_data, aes(x = x, y = y, fill = z)) +
geom_tile() +
scale_fill_gradient(low = "blue", high = "red") +
scale_x_continuous(expand = c(0, 0)) +
scale_y_continuous(expand = c(0, 0))
Doing the same with the rescaled z gives an identical plot, but with the rescaled legend:
ggplot(example_data, aes(x = x, y = y, fill = z_rescale)) +
geom_tile() +
scale_fill_gradient(low = "blue", high = "red") +
scale_x_continuous(expand = c(0, 0)) +
scale_y_continuous(expand = c(0, 0))
What I'd like to do however is have a single plot showing the two different legends, which would look something like this mock-up:
Now, I imagine this would be possible by creating two plots, finding the grob that represents the legend in one of the plots and cunningly adding it to the second plot... however, is there a much simpler way that I'm overlooking?
Many thanks!
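Add the following lines after the geom_tile() line and you will get the desired second legend. They map z_rescale to the colour (tile outline) aesthetic, which gets its own legend alongside the fill legend:
aes(color = z_rescale) +
scale_color_gradient(low = "blue", high = "red") +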
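For reference, here is a minimal sketch of how the whole plot might look with that suggestion applied. It assumes the example_data from the question (rebuilt here with expand.grid() instead of the loops, which is my substitution) and stores the plot in a variable p, a name introduced only for illustration.

```r
library(ggplot2)

# Same dummy data as in the question, built without the nested loops
example_data <- expand.grid(x = 1:100, y = 1:100)
example_data$z <- example_data$x * example_data$y
example_data$z_rescale <- example_data$z * 0.5

p <- ggplot(example_data, aes(x = x, y = y, fill = z)) +
  geom_tile() +
  # Second mapping of the same information: tile outlines coloured by z_rescale
  aes(colour = z_rescale) +
  scale_fill_gradient(low = "blue", high = "red") +
  scale_colour_gradient(low = "blue", high = "red") +
  scale_x_continuous(expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0))

print(p)
```

Because both gradients use the same endpoints and z_rescale is a linear rescaling of z, the two colour bars should line up; the outline colour simply matches the fill of each tile.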
I am making a heatmap-style plot in ggplot that looks something like this:
set.seed(1)
n <- 100
df <- data.frame(x= rnorm(n), y= rnorm(n), z= rnorm(n))
df$z[1:20] <- NA
ggplot() +
geom_point(data= subset(df, !is.na(z)), mapping= aes(x, y, col= z)) +
geom_point(data= subset(df, is.na(z)), mapping= aes(x, y), col= "black", shape= 4)
Here the points of the scatterplot change colour depending on the values in z. I also added black crosses (shape 4) to show where z is missing. I want the legend to show the same cross that appears in the plot, with the label "missing" above it. How can we do this?
I'd actually put everything in a single geom_point call, map the shape aesthetic to is.na(z), and then set the shapes manually with scale_shape_manual:
ggplot(df) +
geom_point(aes(x, y, colour = z, shape = is.na(z))) +
scale_shape_manual(name = "Missing values", values=c(16, 4))
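One thing to watch with this approach: points whose z is NA are drawn in the colour scale's na.value, which defaults to grey rather than black. A hedged sketch of one way to keep the crosses black and give the legend readable labels follows; the label texts "observed" and "missing" are my own choice, not from the question.

```r
library(ggplot2)

set.seed(1)
n <- 100
df <- data.frame(x = rnorm(n), y = rnorm(n), z = rnorm(n))
df$z[1:20] <- NA

p <- ggplot(df) +
  geom_point(aes(x, y, colour = z, shape = is.na(z))) +
  # Draw points with missing z in black instead of the default grey
  scale_colour_gradient(na.value = "black") +
  # FALSE -> filled circle (16), TRUE -> cross (4)
  scale_shape_manual(name = "Missing values",
                     values = c(16, 4),
                     labels = c("observed", "missing"))

print(p)
```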
I am pretty sure that this is easy to do, but I can't seem to find a proper way to phrase this question for Google or Stack Overflow, so here we are:
I have a plot made in ggplot2 which makes use of geom_jitter(), efficiently creating one row for each element in a factor and plotting its values.
I would like to add a complementary geom_violin() to the plot, but just adding the extra geom_ function to the plot code returns two layers: the jitter and the violin, one on top of the other (as usually expected).
EDIT:
This is what the plot looks like:
How can I have the violin as a separate row, without generating a second plot?
Side quest: how can I have the jitter and the violin geoms interleaved? (i.e. element A jitter row followed by element A violin row, and then element B jitter row followed by element B violin row)
This is the minimum required code to make it (without all the theme() embellishments):
P1 <- ggplot(data=TEST_STACK_SUB, aes(x=E, y=C, col=A)) +
theme(... , aspect.ratio=0.3) +
geom_point(position = position_jitter(w = 0.30, h = 0), alpha=0.2, size=0.5) +
geom_violin(data=TEST_STACK_SUB, mapping=aes(x=E, y=C), position="dodge") +
scale_x_discrete() +
scale_y_continuous(limits=c(0,1), breaks=seq(0,1,0.1),
labels=c(seq(0,1,0.1))) +
scale_color_gradient2(breaks=seq(0,100,20),
limits=c(0,100),
low="green3",
high="darkorchid4",
midpoint=50,
name="") +
coord_flip()
options(repr.plot.width=8, repr.plot.height=2)
plot(P1)
Here is a subset of the data to generate it (for you to try):
data
How about manipulating your factor as a continuous variable and nudging the entries across the aes() calls like so:
library(dplyr)
library(ggplot2)
set.seed(42)
tibble(x = rep(c(1, 3), each = 10),
y = c(rnorm(10, 2), rnorm(10))) -> plot_data
ggplot(plot_data) +
geom_jitter(aes(x = x - 0.5, y = y), width = 0.25) +
geom_violin(aes(x = x + 0.5, y = y, group = x), width = 0.5) +
coord_flip() +
labs(x = "x") +
scale_x_continuous(breaks = c(1, 3),
labels = paste("Level", 1:2),
trans = scales::reverse_trans())
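If the grouping variable is a factor, as in the question, the same nudging idea can be sketched by first converting the factor to numeric positions. The data frame below is made up for illustration, and the column name pos is a hypothetical helper, not something from the original code.

```r
library(ggplot2)

set.seed(42)
plot_data <- data.frame(
  group = factor(rep(c("A", "B", "C"), each = 20)),
  y = rnorm(60)
)

# Spread the factor levels two units apart: A -> 1, B -> 3, C -> 5
plot_data$pos <- 2 * as.integer(plot_data$group) - 1

p <- ggplot(plot_data) +
  # Jitter row sits half a unit before each position, violin half a unit after
  geom_jitter(aes(x = pos - 0.5, y = y), width = 0.25, height = 0) +
  geom_violin(aes(x = pos + 0.5, y = y, group = group), width = 0.5) +
  # Label the numeric positions with the original factor levels
  scale_x_reverse(breaks = unique(plot_data$pos),
                  labels = levels(plot_data$group)) +
  coord_flip() +
  labs(x = "group")

print(p)
```

This gives the interleaved layout asked for in the side quest: each level gets a jitter row immediately followed by its violin row.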
This is a continuation of a question I recently asked (Manually assigning colors with scale_fill_manual only works for certain hexagon sizes).
I was unable to plot geom_hex() so that all hexagons were the same size. Someone solved the problem. However, their solution removed the legend key. Now, I am unable to keep all the hexagons the same size while also retaining the legend.
To be specific, I really want to keep the legend labels sensible. In the example below, the legend has values (0,2,4,6,8,20), rather than hexadecimal labels (#08306B, #08519C, etc).
Below is an MWE illustrating the problem. At the end, as per the three comments, you can see that I am able to 1) create a plot with consistent hexagon sizes but no legend, 2) create a plot with a legend but inconsistent hexagon sizes, and 3) attempt to create a plot with consistent hexagon sizes and a legend, but fail:
library(ggplot2)
library(hexbin)
library(RColorBrewer)
library(reshape)
set.seed(1)
xbins <- 10
x <- abs(rnorm(10000))
y <- abs(rnorm(10000))
minVal <- min(x, y)
maxVal <- max(x, y)
maxRange <- c(minVal, maxVal)
buffer <- (maxRange[2] - maxRange[1]) / (xbins / 2)
bindata = data.frame(x=x,y=y,factor=as.factor(1))
h <- hexbin(bindata, xbins = xbins, IDs = TRUE, xbnds = maxRange, ybnds = maxRange)
counts <- hexTapply (h, bindata$factor, table)
counts <- t (simplify2array (counts))
counts <- melt (counts)
colnames (counts) <- c ("factor", "ID", "counts")
counts$factor =as.factor(counts$factor)
hexdf <- data.frame (hcell2xy (h), ID = h@cell)
hexdf <- merge (counts, hexdf)
my_breaks <- c(2, 4, 6, 8, 20, 1000)
clrs <- brewer.pal(length(my_breaks) + 3, "Blues")
clrs <- clrs[3:length(clrs)]
hexdf$countColor <- cut(hexdf$counts, breaks = c(0, my_breaks, Inf), labels = rev(clrs))
# Has consistent hexagon sizes, but no legend
ggplot(hexdf, aes(x=x, y=y, hexID=ID, counts=counts, fill=countColor)) + geom_hex(stat="identity", fill=hexdf$countColor) + scale_fill_manual(labels = as.character(c(0, my_breaks)), values = rev(clrs), name = "Count") + geom_abline(intercept = 0, color = "red", size = 0.25) + labs(x = "A", y = "C") + coord_fixed(xlim = c(-0.5, (maxRange[2]+buffer)), ylim = c(-0.5, (maxRange[2]+buffer))) + theme(aspect.ratio=1)
# Has legend, but inconsistent hexagon sizes
ggplot(hexdf, aes(x=x, y=y, hexID=ID, counts=counts, fill=countColor)) + geom_hex(data=hexdf, stat="identity", aes(fill=countColor)) + scale_fill_manual(labels = as.character(c(0, my_breaks)), values = rev(clrs), name = "Count") + geom_abline(intercept = 0, color = "red", size = 0.25) + labs(x = "A", y = "C") + coord_fixed(xlim = c(-0.5, (maxRange[2]+buffer)), ylim = c(-0.5, (maxRange[2]+buffer))) + theme(aspect.ratio=1)
# One attempt to create consistent hexagon sizes and retain legend
ggplot(hexdf, aes(x=x, y=y, hexID=ID, counts=counts, fill=countColor)) + geom_hex(data=hexdf, aes(fill=countColor)) + geom_hex(stat="identity", fill=hexdf$countColor) + scale_fill_manual(labels = as.character(c(0, my_breaks)), values = rev(clrs), name = "Count") + geom_abline(intercept = 0, color = "red", size = 0.25) + labs(x = "A", y = "C") + coord_fixed(xlim = c(-0.5, (maxRange[2]+buffer)), ylim = c(-0.5, (maxRange[2]+buffer))) + theme(aspect.ratio=1)
Any suggestions on how to keep the hexagon sizes consistent while retaining the legend would be very helpful!
Wow, this is an interesting one -- geom_hex seems to really dislike mapping color/fill onto categorical variables. I assume that's because it is designed to be a two-dimensional histogram and visualize continuous summary statistics, but if anyone has any insight into what's going on behind the scenes, I would love to know.
For your specific problem, that really throws a wrench in the works, because you're attempting to have categorical colorization that assigns non-linear groups to the individual hexagons. Conceptually, you might consider why you're doing that. There may be a good reason, but you're essentially taking a linear color gradient and mapping it non-linearly onto your data, which can end up being visually misleading.
However, if that is what you want to do, the best approach I could come up with was to create a new continuous variable that mapped linearly onto your chosen colors and then use those to create a color gradient. Let me try to walk you through my thought process.
You essentially have a continuous variable (counts) that you want to map onto colors. That's easy enough with a simple color gradient, which is the default in ggplot2 for continuous variables. Using your data:
ggplot(hexdf, aes(x=x, y=y)) +
geom_hex(stat="identity", aes(fill=counts))
yields something close.
However, the bins with really high counts wash out the gradient for points with much lower counts, so we need to change the way the gradient maps colors onto values. You've already declared the colors you want to use in the clrs variable; we just need to add a column to your data frame to use in conjunction with these colors to create a smooth gradient. I did that as follows:
all_breaks <- c(0, my_breaks)
breaks_n <- 1:length(all_breaks)
get_break_n <- function(n) {
break_idx <- max(which((all_breaks - n) < 0))
breaks_n[break_idx]
}
hexdf$bin <- sapply(hexdf$counts, get_break_n)
We create the bin variable as the index of the largest break that is strictly below the count. Now, you'll notice that:
ggplot(hexdf, aes(x=x, y=y)) +
geom_hex(stat="identity", aes(fill=bin))
is getting much closer to the goal.
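As an aside, the get_break_n helper above can probably be replaced by base R's findInterval(), which is vectorised. The sketch below compares the two on a few made-up counts; the breaks are c(0, my_breaks) from the question, and the names bin_loop and bin_vec are introduced only for this comparison.

```r
all_breaks <- c(0, 2, 4, 6, 8, 20, 1000)   # c(0, my_breaks) from the question
counts <- c(1, 2, 3, 20, 21, 1000, 1500)   # made-up counts for illustration

# Helper equivalent to the one above: index of the largest break strictly below n
get_break_n <- function(n) max(which((all_breaks - n) < 0))
bin_loop <- sapply(counts, get_break_n)

# Vectorised equivalent: left.open = TRUE makes the intervals (lower, upper],
# so a count equal to a break falls into the lower bin, as in the helper
bin_vec <- findInterval(counts, all_breaks, left.open = TRUE)

print(rbind(counts, bin_loop, bin_vec))
```

With this, the column could be created as hexdf$bin <- findInterval(hexdf$counts, all_breaks, left.open = TRUE), assuming all counts are positive.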
The next step is to change how the color gradient maps onto that bin variable, which we can do by adding a call to scale_fill_gradientn:
ggplot(hexdf, aes(x=x, y=y)) +
geom_hex(stat="identity", aes(fill=bin)) +
scale_fill_gradientn(colors=rev(clrs[-1])) # odd color reversal to
# match OP's color mapping
This takes a vector of colors between which you want to interpolate a gradient. The way we've set it up, the points along the interpolation will perfectly match up with the unique values of the bin variable, meaning each value will get one of the colors specified.
Now we're cooking with gas, and the only thing left to do is add the various bells and whistles from the original graph. Most importantly, we need to make the legend look the way we want. This requires three things: (1) changing it from the default color bar to a discretized legend, (2) specifying our own custom labels, and (3) giving it an informative title.
# create the custom labels for the legend
all_break_labs <- as.character(all_breaks[1:(length(all_breaks)-1)])
ggplot(hexdf, aes(x=x, y=y)) +
geom_hex(stat="identity", aes(fill=bin)) +
scale_fill_gradientn(colors=rev(clrs[-1]),
guide="legend", # (1) make legend discrete
labels=all_break_labs, # (2) specify labels
name="Count") + # (3) legend title
# All the other prettification from the OP
geom_abline(intercept = 0, color = "red", size = 0.25) +
labs(x = "A", y = "C") +
coord_fixed(xlim = c(-0.5, (maxRange[2]+buffer)),
ylim = c(-0.5, (maxRange[2]+buffer))) +
theme(aspect.ratio=1)
All of this leaves us with the following graph:
Hopefully that helps you out. For completeness, here's the new code in full:
# ... the rest of your code before the plots
clrs <- clrs[3:length(clrs)]
hexdf$countColor <- cut(hexdf$counts,
breaks = c(0, my_breaks, Inf),
labels = rev(clrs))
### START OF NEW CODE ###
# create new bin variable
all_breaks <- c(0, my_breaks)
breaks_n <- 1:length(all_breaks)
get_break_n <- function(n) {
break_idx <- max(which((all_breaks - n) < 0))
breaks_n[break_idx]
}
hexdf$bin <- sapply(hexdf$counts, get_break_n)
# create legend labels
all_break_labs <- as.character(all_breaks[1:(length(all_breaks)-1)])
# create final plot
ggplot(hexdf, aes(x=x, y=y)) +
geom_hex(stat="identity", aes(fill=bin)) +
scale_fill_gradientn(colors=rev(clrs[-1]),
guide="legend",
labels=all_break_labs,
name="Count") +
geom_abline(intercept = 0, color = "red", size = 0.25) +
labs(x = "A", y = "C") +
coord_fixed(xlim = c(-0.5, (maxRange[2]+buffer)),
ylim = c(-0.5, (maxRange[2]+buffer))) +
theme(aspect.ratio=1)
I am trying to make a plot using several contour levels with geom_contour. Each of these levels defines a zone onto which I plot points with geom_point. My problem is that I can't get both a colour scale for the points and one for the contour levels on the same plot, whether it is the same scale or a different one.
MWE:
X <- data.frame(x1 = rnorm(1e4), x2 = rnorm(1e4))
X$z <- sqrt(rowSums(X^2))
X$level <- factor(floor(X$z))
xplot <- yplot <- c(-80:80)/10
df_plot = data.frame(expand.grid(x1=xplot, x2=yplot))
df_plot$z = sqrt(rowSums(df_plot^2))
# plot several contour
ggplot(data = df_plot, aes(x1,x2)) + geom_contour(aes(z=z, color=..level..), breaks = c(1:5))
# plot points with colors corresponding to zone
ggplot(data = X, aes(x1,x2)) + geom_point(aes(color=level))
# plot both
ggplot(data = X, aes(x1,x2)) + geom_point(aes(color=level)) +
geom_contour(data = df_plot, aes(z=z), breaks = 1:5)
On this third plot I'd like the contour levels to have the same colours as the points, or at least a separate colour scale. I've tried putting color= both inside and outside aes(), but it does not change anything.
thanks
The issue here is that you are mixing a discrete and a continuous colour scale (for the points and the contours, respectively) and ggplot2 uses different defaults for the two. By making the colour scale for the contours discrete as well, you can get the same colours:
ggplot(data = X, aes(x = x1, y = x2)) + geom_point(aes(colour = level)) +
geom_contour(data = df_plot, aes(z = z, colour = factor(..level.. - 1)),
breaks = 0:5, size = 1)
Note that I have reduced the number of points and increased the thickness of the lines to make the lines more visible.
This is a slightly long-winded way of getting what you want, but you get there in the end.
ggplot(data = X, aes(x1,x2)) +
geom_point(aes(color=level)) + # Now add each contour separately.
geom_contour(data = df_plot, aes(z=z), breaks = 1, colour=rainbow(8)[1]) +
geom_contour(data = df_plot, aes(z=z), breaks = 2, colour=rainbow(8)[2]) +
scale_colour_manual(values=rainbow(8))
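To avoid writing one geom_contour call per level, the layers could be generated in a loop and added to the plot as a list. This is only a sketch of that idea: the names pal and contour_layers are hypothetical, and the data are regenerated with a seed so the block runs on its own.

```r
library(ggplot2)

set.seed(1)
X <- data.frame(x1 = rnorm(1e4), x2 = rnorm(1e4))
X$z <- sqrt(rowSums(X^2))
X$level <- factor(floor(X$z))

xplot <- yplot <- c(-80:80) / 10
df_plot <- expand.grid(x1 = xplot, x2 = yplot)
df_plot$z <- sqrt(rowSums(df_plot^2))

pal <- rainbow(8)

# One contour layer per break; break b takes the b-th palette colour,
# matching the colour of the points inside that contour
contour_layers <- lapply(1:5, function(b) {
  geom_contour(data = df_plot, aes(z = z), breaks = b, colour = pal[b])
})

p <- ggplot(data = X, aes(x1, x2)) +
  geom_point(aes(colour = level)) +
  contour_layers +                      # a list of layers can be added with +
  scale_colour_manual(values = pal)

print(p)
```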
I have a plot with three different lines. I want one of those lines to have points on it as well. I also want the two lines without points to be thicker than the one with points. I have managed to get the plot I want, but the legend isn't keeping up.
library(ggplot2)
y <- c(1:10, 2:11, 3:12)
x <- c(1:10, 1:10, 1:10)
testnames <- c(rep('mod1', 10), rep('mod2', 10), rep('meas', 10))
df <- data.frame(testnames, y, x)
ggplot(data=df, aes(x=x, y=y, colour=testnames)) +
geom_line(aes(size=testnames)) +
scale_size_manual("", values=c(0.5,1,1)) +
geom_point(aes(alpha=testnames), size=5, shape=4) +
scale_alpha_manual("", values=c(1, 0, 0))
I can remove the second (black) legend:
ggplot(data = df, aes(x=x, y=y, colour=testnames)) +
geom_line(aes(size=testnames)) +
scale_size_manual("", values=c(0.5,1,1), guide='none') +
geom_point(aes(alpha=testnames), size=5, shape=4) +
scale_alpha_manual("", values=c(1, 0.05, 0.05), guide='none')
But what I really want is a merge of the two legends - a legend with colours, cross only on the first variable (meas) and the lines of mod1 and mod2 thicker than the first line. I have tried guide and override, but with little luck.
You don't need transparency to hide the shapes for mod1 and mod2. You can omit these points from the plot and legend by setting their shape to NA in scale_shape_manual:
ggplot(data = df, aes(x = x, y = y, colour = testnames, size = testnames)) +
geom_line() +
geom_point(aes(shape = testnames), size = 5) +
scale_size_manual(values=c(0.5, 2, 2)) +
scale_shape_manual(values=c(8, NA, NA))
This gives the following plot:
NOTE: I used some more distinct values in the size-scale and another shape in order to better illustrate the effect.