Showing separate legend for a geom_text layer? - r

I have the following plot:
library(ggplot2)
ib<- data.frame(
category = factor(c("Cat1","Cat2","Cat1", "Cat1", "Cat2","Cat1","Cat1", "Cat2","Cat2")),
city = c("CITY1","CITY1","CITY2","CITY3", "CITY3","CITY4","CITY5", "CITY6","CITY7"),
median = c(1.3560, 2.4830, 0.7230, 0.8100, 3.1480, 1.9640, 0.6185, 1.2205, 2.4000),
samplesize = c(851, 1794, 47, 189, 185, 9, 94, 16, 65)
)
p<-ggplot(data=ib, aes(x=city, y=category, size=median, colour=category, label=samplesize)) +
geom_point(alpha=.6) +
scale_area(range=c(1,15)) +
scale_colour_hue(guide="none") +
geom_text(aes(size = 1), colour="black")
p
(I'm plotting the circles proportional to a median value and overlaying with a text label representing the sample size. image at http://imgur.com/T82cF)
Is there any way to SEPARATE the two legends? I would like one legend (labeled "median") to give the scale of circles, and the other legend with a single letter "a" (or even better a number) which I could label "sample size". Since the two properties are not related in any way, it doesn't make sense to bundle them in the same legend.
I've tried all sorts of combinations but the best I can come up with is losing the text legend altogether :)
thanks for the answer!

Updated: scale_area has been deprecated, so scale_size is used instead. The gtable function gtable_filter() is used to extract the legends, and modified code replaces the default legend key in one of the legends.
If you are still looking for an answer to your question, here's one that seems to do most of what you want, although it's a bit of a hack in places. The symbol in the legend can be changed using kohske's comment here
The difficulty was trying to apply the two different size mappings. So, I've left the dot size mapping inside the aesthetic statement but removed the label size mapping from the aesthetic statement. This means that label size has to be set according to discrete values of a factor version of samplesize (fsamplesize). The resulting chart is nearly right, except the legend for label size (i.e., samplesize) is not drawn. To get round that problem, I drew a chart that contained a label size mapping according to the factor version of samplesize (but ignoring the dot size mapping) in order to extract its legend which can then be inserted back into the first chart.
## Your data
ib<- data.frame(
category = factor(c("Cat1","Cat2","Cat1", "Cat1", "Cat2","Cat1","Cat1", "Cat2","Cat2")),
city = c("CITY1","CITY1","CITY2","CITY3", "CITY3","CITY4","CITY5", "CITY6","CITY7"),
median = c(1.3560, 2.4830, 0.7230, 0.8100, 3.1480, 1.9640, 0.6185, 1.2205, 2.4000),
samplesize = c(851, 1794, 47, 189, 185, 9, 94, 16, 65)
)
## Load packages
library(ggplot2)
library(gridExtra)
library(gtable)
library(grid)
## Obtain the factor version of samplesize.
ib$fsamplesize = cut(ib$samplesize, breaks = c(0, 100, 1000, Inf))
## Obtain plot with dot size mapped to median, the label inside the dot set
## to samplesize, and the size of the label set to the discrete levels of the factor
## version of samplesize. Here, I've selected three sizes for the labels (3, 6 and 10)
## corresponding to samplesizes of 0-100, 100-1000, >1000. The sizes of the labels are
## set using three calls to geom_text - one for each size.
p <- ggplot(data=ib, aes(x=city, y=category)) +
geom_point(aes(size = median, colour = category), alpha = .6) +
scale_size("Median", range=c(0, 15)) +
scale_colour_hue(guide = "none") + theme_bw()
p1 <- p +
geom_text(aes(label = ifelse(samplesize > 1000, samplesize, "")),
size = 10, color = "black", alpha = 0.6) +
geom_text(aes(label = ifelse(samplesize <= 100, samplesize, "")),
size = 3, color = "black", alpha = 0.6) +
geom_text(aes(label = ifelse(samplesize > 100 & samplesize <= 1000, samplesize, "")),
size = 6, color = "black", alpha = 0.6)
## Extract the legend from p1 using gtable_filter() from the gtable package
g1 = ggplotGrob(p1)
leg1 = gtable_filter(g1, "guide-box")
## Keep p1 but dump its legend
p1 = p1 + theme(legend.position = "none")
## Get second legend - size of the label.
## Draw a dummy plot, using fsamplesize as a size aesthetic. Note that the label sizes are
## set to 3, 6, and 10, matching the sizes of the labels in p1.
dummy.plot = ggplot(data = ib, aes(x = city, y = category, label = samplesize)) +
geom_point(aes(size = fsamplesize), colour = NA) +
geom_text(show.legend = FALSE) + theme_bw() +
guides(size = guide_legend(override.aes = list(colour = "black", shape = utf8ToInt("N")))) +
scale_size_manual("Sample Size", values = c(3, 6, 10),
breaks = levels(ib$fsamplesize), labels = c("< 100", "100 - 1000", "> 1000"))
## Get the legend from dummy.plot using gtable_filter() from the gtable package
g2 = ggplotGrob(dummy.plot)
leg2 = gtable_filter(g2, "guide-box")
## Arrange the three components (p1, leg1, leg2) using functions from the gridExtra package
## The two legends are arranged using the inner arrangeGrob function. The resulting
## chart is then arranged with p1 in the outer arrangeGrob function.
ib.plot = arrangeGrob(p1, arrangeGrob(leg1, leg2, nrow = 2), ncol = 2,
widths = unit(c(9, 2), c("null", "null")))
## Draw the graph
grid.newpage()
grid.draw(ib.plot)
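As a quick check of the binning step, the sketch below applies the same cut() breaks to the question's samplesize values and tabulates how many labels fall into each size class. The label strings are illustrative and not part of the answer's code. Note that cut() builds right-closed intervals by default, so a value of exactly 100 would land in the first class.

```r
# Sample sizes from the question's data frame
samplesize <- c(851, 1794, 47, 189, 185, 9, 94, 16, 65)

# Same breaks as the answer; the intervals are (0,100], (100,1000], (1000,Inf]
fsamplesize <- cut(samplesize, breaks = c(0, 100, 1000, Inf),
                   labels = c("<= 100", "100 - 1000", "> 1000"))  # illustrative labels

# Number of labels that will be drawn at each of the three text sizes
bin_counts <- table(fsamplesize)
print(bin_counts)
```

Each level of the factor corresponds to one of the three geom_text() calls, so the table shows how many labels each call actually draws.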

This actually doesn't directly address your question, but it is how I might go about creating a graph with the general characteristics you describe:
ib$ss <- paste("n = ",ib$samplesize,sep = "")
ggplot(data=ib, aes(x=city, y=category, size=median, colour=category, label=ss)) +
geom_point(alpha=.6) +
geom_text(size = 2, vjust = -1.2,colour="black") +
scale_colour_hue(guide = "none")
I removed the scale_area piece, as I'm not sure what purpose it served and it was causing errors for me.
So the rationale here is that the sample size information feels more like an annotation to me than something that deserves its own scale and legend. Opinions may differ on that, of course, but I thought I'd put it out there in case you find it useful.

This too doesn't answer your question. I've left samplesize inside the circle. Also, samplesize to me is more like an annotation than a legend.
But I think you are using an old version of ggplot2. There have been some changes in ggplot2 version 0.9.0. I've made the changes below.
p<-ggplot(data=ib, aes(x=city, y=category, size=median, colour=category, label=samplesize)) +
geom_point(alpha=.6) +
scale_area(range = c(1,15)) + # range instead of to
scale_colour_hue(guide = "none") + # guide instead of legend
geom_text(size = 2.5, colour="black")
p

Related

ggplot2 geom_points won't colour or dodge

So I'm using ggplot2 to plot both a bar graph and points. I'm currently getting this:
As you can see the bars are nicely separated and colored in the desired colors. However, my points are all uncolored and stacked on top of each other. I would like the points to be above their designated bar and in the same color.
#Add bars
A <- A + geom_col(aes(y = w1, fill = factor(Species1)),
position = position_dodge(preserve = 'single'))
#Add colors
A <- A + scale_fill_manual(values = c("A. pelagicus"= "skyblue1","A. superciliosus"="dodgerblue","A. vulpinus"="midnightblue","Alopias sp."="black"))
#Add points
A <- A + geom_point(aes(y = f1/2.5),
shape= 24,
size = 3,
fill = factor(Species1),
position = position_dodge(preserve = 'single'))
#change x and y axis range
A <- A + scale_x_continuous(breaks = c(2000:2020), limits = c(2016,2019))
A <- A + expand_limits(y=c(0,150))
# now adding the secondary axis, following the example in the help file ?scale_y_continuous
# and, very important, reverting the above transformation
A <- A + scale_y_continuous(sec.axis = sec_axis(~.*2.5, name = " "))
# modifying axis and title
A <- A + labs(y = " ",
x = " ")
A <- A + theme(plot.title = element_text(size = rel(4)))
A <- A + theme(axis.text.x = element_text(face="bold", size=14, angle=45),
axis.text.y = element_text(face="bold", size=14))
#A <- A + theme(legend.title = element_blank(),legend.position = "none")
#Print plot
A
When I run this code I get the following error:
Error: Unknown colour name: A. pelagicus
In addition: Warning messages:
1: Width not defined. Set with position_dodge(width = ?)
2: In max(table(panel$xmin)) : no non-missing arguments to max; returning -Inf
I've tried a couple of things but I can't figure out why it works for geom_col and not for geom_point.
Thanks in advance
There are two basic problems here: the colour error and the points not dodging. The colour error most likely comes from fill = factor(Species1) sitting outside aes() in geom_point(), where ggplot2 treats the species names as literal colour names; the dodging is fixed by applying the group= aesthetic.
You'll see the answer to these two questions using an example:
# dummy dataset
year <- c(rep(2017, 4), rep(2018, 4))
species <- rep(c('things', 'things1', 'wee beasties', 'ew'), 2)
values <- c(10, 5, 5, 4, 60, 10, 25, 7)
pt.value <- c(8, 7, 10, 2, 43, 12, 20, 10)
df <-data.frame(year, species, values, pt.value)
I made the "values" set for my column heights and I wanted to use a different y aesthetic for points for illustrative purposes, called "pt.value". Otherwise, the data setup is similar to your own. Note that df$year will be set as numeric, so it's best to change that into either Date format (kinda more trouble than it's worth here), or just as a factor, since "2017.5" isn't gonna make too much sense here :). The point is, I need "year" to be discrete, not continuous.
Solve the color error
For the plot, I'll try to create it similar to yours. Note that the values= argument of scale_fill_manual() takes a named character vector, c(name1 = color1, name2 = color2, ...), whose names match the levels of the variable mapped to fill. Your scale is already written that way, which is why geom_col is coloured correctly. The mapping itself has to go inside aes(): a fill = factor(Species1) passed outside aes() is read as a vector of colour names, hence "Unknown colour name: A. pelagicus".
ggplot(df, aes(x=as.factor(year), y=values)) +
geom_col(aes(fill=species), position=position_dodge(width=0.62), width=0.6) +
scale_fill_manual(values=
c('ew' = 'skyblue1', 'things' = 'dodgerblue',
'things1'='midnightblue', 'wee beasties' = 'gray')) +
geom_point(aes(y=pt.value), shape=24, position=position_dodge(width=0.62)) +
theme_bw() + labs(x='Year')
So the colors are applied correctly and my axis is discrete, and the y values of the points are mapped to pt.value like I wanted, but why don't the points dodge?!
Solve the dodging issue
Dodging is a funny thing in ggplot2. The best reasoning here I can give you is that for columns and barplots, dodging is sort of "built-in" to the geom, since the default position is "stack" and "dodge" represents an alternative method to draw the geom. For points, text, labels, and others, the default position is "identity" and you have to be more explicit in how they are going to dodge or they just don't dodge at all.
Basically, we need to let the points know what they are dodging based on. Is it "species"? With geom_col, it's assumed to be, but with geom_point, you need to specify. We do that by using a group= aesthetic, which lets the geom_point know what to use as criteria for dodging. When you add that, it works!
ggplot(df, aes(x=as.factor(year), y=values, group=species)) +
geom_col(aes(fill=species), position=position_dodge(width=0.62), width=0.6) +
scale_fill_manual(values=
c('ew' = 'skyblue1', 'things' = 'dodgerblue',
'things1'='midnightblue', 'wee beasties' = 'gray')) +
geom_point(aes(y=pt.value), shape=24, position=position_dodge(width=0.62)) +
theme_bw() + labs(x='Year')
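If the points should also carry the species colours, as the question asks, one option is to map fill at the plot level so that both layers share the manual scale. This is a sketch built on the dummy data from this answer, not code from the original post. A fillable shape such as 24 is needed for fill to show on points, and mapping a discrete variable to fill also gives position_dodge() a grouping to work with.

```r
library(ggplot2)

# Dummy data, same values as in the answer above, with year as a factor
df <- data.frame(
  year     = factor(rep(c(2017, 2018), each = 4)),
  species  = rep(c("things", "things1", "wee beasties", "ew"), 2),
  values   = c(10, 5, 5, 4, 60, 10, 25, 7),
  pt.value = c(8, 7, 10, 2, 43, 12, 20, 10)
)

# Named character vector: names are the species, values are the colours
species_cols <- c("ew" = "skyblue1", "things" = "dodgerblue",
                  "things1" = "midnightblue", "wee beasties" = "gray")

dodge <- position_dodge(width = 0.62)

p <- ggplot(df, aes(x = year, y = values, fill = species)) +
  geom_col(position = dodge, width = 0.6) +
  # fill is inherited from the plot-level aes(), so points match their bars
  geom_point(aes(y = pt.value), shape = 24, size = 3, position = dodge) +
  scale_fill_manual(values = species_cols) +
  theme_bw() + labs(x = "Year")

# Inspect the computed point layer: x positions are dodged, fills come from the scale
pts <- layer_data(p, 2)
print(pts[, c("x", "y", "fill")])
```

Printing p draws the chart; layer_data() is used here only to confirm that each point received its own x offset and a colour from the manual scale.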

Scale geom_point() size to increase size based on distance from zero

I'd like to plot some measures that have been standardized to z-scores. I want the size of the point in geom_point() to increase from 0 to 3, and also to increase from 0 to -3. I also want the colour to change from red, to blue. The trick is to get both to work together.
Here is an example that's as close as I can get to what I'd like, note that the size of the point increases from -2, whereas I want the size of the point to increase as the z_score moves away from zero.
library(tidyverse)
year <- rep(c(2015:2018), each = 3)
parameters <- rep(c("length", "weight", "condition"), 4)
z_score <- runif(12, min = -2, max = 2)
df <- tibble(year, parameters, z_score)
cols <- c("#d73027",
"darkgrey",
"#4575b4")
ggplot(df, aes(year, parameters, colour = z_score, size = z_score)) +
geom_point() +
scale_colour_gradientn(colours = cols) +
theme(legend.position="bottom") +
scale_size(range = c(1,15)) +
guides(color= guide_legend(), size=guide_legend())
bubble plot output
One trick I tried was to use the absolute value of z_score which scaled the points correctly but messed up the legend.
Here's what I'd like the legend and points size to be scaled to, though I'd like the colour to be a gradient as in my example. Any insight would be greatly appreciated!
Link to plot legend
You were very close. In order to adjust the size of the points in the legend, use the override.aes option in the guides function.
library(ggplot2)
library(tibble) # provides tibble()
year <- rep(c(2015:2018), each = 3)
parameters <- rep(c("length", "weight", "condition"), 4)
z_score <- runif(12, min = -2, max = 2)
df <- tibble(year, parameters, z_score)
cols <- c("#d73027", "darkgrey", "#4575b4")
ggplot(df, aes(year, parameters, colour = z_score)) +
geom_point( size=abs(5*df$z_score)) + # times 5 to increase size
scale_colour_gradientn(colours = cols) +
theme(legend.position="bottom") +
scale_size(range = c(1,15)) +
guides(color=guide_legend(override.aes = list(size = c( 5, 1, 5))) )
To suppress the legend for the size attribute, I moved size outside aes(). This works for this example, but you will have to adjust size = c(...) to match the number of breaks in the legend.
This should get you most of the way there.
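Another option, sketched here under the assumption that a size legend labelled by the absolute z-score is acceptable, is to map size to abs(z_score) inside aes() and keep colour mapped to the signed value. ggplot2 then builds the size legend itself and no override.aes bookkeeping is needed. The seed, legend titles, and colour limits are illustrative choices.

```r
library(ggplot2)

set.seed(1)  # illustrative seed so the random z-scores are reproducible
df <- data.frame(
  year       = rep(2015:2018, each = 3),
  parameters = rep(c("length", "weight", "condition"), 4),
  z_score    = runif(12, min = -2, max = 2)
)
cols <- c("#d73027", "darkgrey", "#4575b4")

p <- ggplot(df, aes(year, parameters)) +
  # colour follows the signed z-score, size follows its distance from zero
  geom_point(aes(colour = z_score, size = abs(z_score))) +
  scale_colour_gradientn(name = "z-score", colours = cols, limits = c(-2, 2)) +
  scale_size(name = "|z-score|", range = c(1, 15)) +
  theme(legend.position = "bottom")

# Computed point sizes span the requested range
sizes <- layer_data(p, 1)$size
print(range(sizes))
```

The trade-off is that the size legend shows magnitudes only, so the sign has to be read from the colour legend.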

Create a concentric circle legend for a ggplot bubble chart

I am trying to recreate this visualization of a bubble chart using ggplot2 (I have found the code for doing this in R, but not with the ggplot2 package). This is what I have so far. There are some other errors with my code at the moment, but I want to have the legend show concentric circles for size, versus circles shown in rows. Thanks for your help!
Original visualization:
My reproduction:
My (simplified) code:
crime <-
read.csv("http://datasets.flowingdata.com/crimeRatesByState2005.tsv",
header=TRUE, sep="\t")
ggplot(crime,
mapping= aes(x=murder, y=burglary))+
geom_point(aes(size=population), color="red")+
geom_text(aes(label=state.name), show.legend=FALSE, size=3)+
theme(legend.position = c(0.9, 0.2))
Here's an approach where we build the legend as imagined from scratch.
1) This part slightly tweaks your base chart.
Thank you for including the source data. I missed that earlier and have edited this answer to use it. I switched to a different point shape so that we can specify both outside border (color) as well as interior fill.
gg <- ggplot(crime,
mapping= aes(x=murder, y=burglary))+
geom_point(aes(size=population), shape = 21, color="white", fill = "red")+
ggrepel::geom_text_repel(aes(label = state.name),
size = 3, segment.color = NA,
point.padding = unit(0.1, "lines")) +
theme_classic() +
# scale_size_area() makes point area (not radius) proportional to the value;
# max_size sets the largest point and guide = "none" hides the legend
scale_size_area(max_size = 20, guide = "none")
2) Here I make another table to use for the concentric legend circles
library(dplyr); library(ggplot2)
legend_bubbles <- data.frame(
label = c("3", "20", "40m"),
size = c(3E6, 20E6, 40E6)
) %>%
mutate(radius = sqrt(size / pi))
3) This section adds the legend bubbles, text, and title.
It's not ideal, since different print sizes will require placement tweaks. But it seems like it'd get complicated to get into the underlying grobs with ggplot_build to extract and use those sizing adjustments...
gg + geom_point(data = legend_bubbles,
# The "radius/50" was trial and error. Better way?
aes(x = 8.5, y = 250 + radius/50, size = size),
shape = 21, color = "black", fill = NA) +
geom_text(data = legend_bubbles, size = 3,
aes(x = 8.5, y = 275 + 2 * radius/50, label = label)) +
annotate("text", x = 8.5, y = 450, label = "Population", fontface = "bold")
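The radius column in step 2 is there because scale_size_area() makes the circle area, not the radius, proportional to the value. Here is a short sketch of that arithmetic using the same three legend populations. The ratio variables are only for illustration.

```r
# Legend populations from step 2
size <- c(3e6, 20e6, 40e6)

# Radius of a circle whose area equals the population value
radius <- sqrt(size / pi)

# Area grows linearly with population, radius only with its square root
area_ratio   <- size / size[1]
radius_ratio <- radius / radius[1]
print(data.frame(size, radius, area_ratio, radius_ratio))
```

This is why the vertical offsets in step 3 are expressed in terms of radius: they position each circle by how tall it is drawn, not by its raw population.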

Ggplot2 in R gives incorrect coloring when creating overlapping demographic pyramids

I am creating overlapping demographic pyramids in R with the ggplot2 library to compare demographic data from two different sources.
I have however run into problems with ggplot2 and the colouring when using the alpha parameter. I have tried to make sense of ggplot2 and the geom_bar structure, but so far it has gotten me nowhere. The deal is to draw four geom_bars where two geom_bars are overlapping each other (males and females, respectively). I'd have no problems if I didn't need to use alpha to demonstrate differences in my data.
I would really appreciate some answers about where I am going wrong here. As an R programmer I am pretty close to a beginner, so bear with me if my code looks weird.
Below is my code which results in the image also shown below. I have altered my demographic data to be random for this question.
library(ggplot2)
# Here I randomise my data for StackOverflow
poptest<-data.frame(matrix(NA, nrow = 101, ncol = 5))
poptest[,1]<- seq(0,100)
poptest[,2]<- rpois(n = 101, lambda = 100)
poptest[,3]<- rpois(n = 101, lambda = 100)
poptest[,4]<- rpois(n = 101, lambda = 100)
poptest[,5]<- rpois(n = 101, lambda = 100)
colnames(poptest) <- c("age","A_males", "A_females","B_males", "B_females")
myLimits<-c(-250,250)
myBreaks<-seq(-250,250,50)
# Plot demographic pyramid
poptestPlot <- ggplot(data = poptest) +
geom_bar(aes(age,A_females,fill="black"), stat = "identity", alpha=0.75, position = "identity")+
geom_bar(aes(age,-A_males, fill="black"), stat = "identity", alpha=0.75, position="identity")+
geom_bar(aes(age,B_females, fill="white"), stat = "identity", alpha=0.5, position="identity")+
geom_bar(aes(age,-B_males, fill="white"), stat = "identity", alpha=0.5, position="identity")+
coord_flip()+
#set the y-axis which (because of the flip) shows as the x-axis
scale_y_continuous(name = "",
limits = myLimits,
breaks = myBreaks,
#give the values on the y-axis a name, to remove the negatives
#give abs() command to remove negative values
labels = paste0(as.character(abs(myBreaks))))+
#set the x-axis which (because of the flip) shows as the y-axis
scale_x_continuous(name = "age",breaks=seq(0,100,5)) +
#remove the legend
theme(legend.position = 'none')+
# Annotate geom_bars
annotate("text", x = 100, y = -200, label = "males",size=6)+
annotate("text", x = 100, y = 200, label = "females",size=6)
# show results in a separate window
x11()
print(poptestPlot)
This is what I get as result: (sorry, as a StackOverflow noob I can't embed my pictures)
Ggplot2 result
The colouring is really nonsensical. Black is not black and white is not white. Instead it may use some sort of default coloring because R or ggplot2 can't interpret my code.
I welcome any and all answers. Thank you.
You are trying to map "black" to data points. That means you would have to add a manual scale and tell ggplot to colour each instance of "black" in colour "black". There is a shortcut for this called scale_fill_identity (scale_colour_identity is the equivalent for the colour aesthetic). However, if this is your only level, it is much easier to just use fill outside the aes. This way the whole geom is filled in black or white respectively:
poptestPlot <- ggplot(data = poptest) +
geom_bar(aes(age,A_females),fill="black", stat = "identity", alpha=0.75, position = "identity")+
geom_bar(aes(age,-A_males), fill="black", stat = "identity", alpha=0.75, position="identity")+
geom_bar(aes(age,B_females), fill="white", stat = "identity", alpha=0.5, position="identity")+
geom_bar(aes(age,-B_males), fill="white", stat = "identity", alpha=0.5, position="identity")+
coord_flip()+
#set the y-axis which (because of the flip) shows as the x-axis
scale_y_continuous(name = "",
limits = myLimits,
breaks = myBreaks,
#give the values on the y-axis a name, to remove the negatives
#give abs() command to remove negative values
labels = paste0(as.character(abs(myBreaks))))+
#set the x-axis which (because of the flip) shows as the y-axis
scale_x_continuous(name = "age",breaks=seq(0,100,5)) +
#remove the legend
theme(legend.position = 'none')+
# Annotate geom_bars
annotate("text", x = 100, y = -200, label = "males",size=6)+
annotate("text", x = 100, y = 200, label = "females",size=6)
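The difference between mapping and setting can be seen on a tiny made-up data frame (the data and object names below are illustrative). Inside aes(), "black" is treated as a data value and gets a colour from the default discrete palette; scale_fill_identity() tells ggplot2 to use the value as the colour itself; outside aes(), fill is simply a constant.

```r
library(ggplot2)

d <- data.frame(age = 0:4, n = c(5, 3, 4, 2, 1))  # made-up data

# 1) Mapped: "black" is a one-level variable, coloured by the default palette
p_mapped <- ggplot(d) + geom_col(aes(age, n, fill = "black"))

# 2) Mapped with an identity scale: the value is used as the colour
p_identity <- p_mapped + scale_fill_identity()

# 3) Set outside aes(): a constant fill for the whole layer
p_constant <- ggplot(d) + geom_col(aes(age, n), fill = "black")

# Fill actually used by each plot after the scales have been applied
fills <- c(
  mapped   = unique(layer_data(p_mapped)$fill),
  identity = unique(layer_data(p_identity)$fill),
  constant = unique(layer_data(p_constant)$fill)
)
print(fills)
```

Only the first variant ends up with a colour other than black, which matches the behaviour described in the question.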

Consistent hexagon sizes and legend for manual assignment of colors

This is a continuation of a question I recently asked (Manually assigning colors with scale_fill_manual only works for certain hexagon sizes).
I was unable to plot geom_hex() so that all hexagons were the same size. Someone solved the problem. However, their solution removed the legend key. Now, I am unable to keep all the hexagons the same size while also retaining the legend.
To be specific, I really want to keep the legend labels sensical. In the example below, the legend has values (0,2,4,6,8,20), rather than hexadecimal labels (#08306B, #08519C, etc).
Below is an MWE illustrating the problem. At the end, as per the 3 comments, you can see that I am able to 1) Create a plot with consistent hexagon sizes but no legend, 2) Create a plot with legend, but inconsistent hexagon sizes, 3) Attempt to create a plot with consistent hexagon sizes and legend but fail:
library(ggplot2)
library(hexbin)
library(RColorBrewer)
library(reshape)
set.seed(1)
xbins <- 10
x <- abs(rnorm(10000))
y <- abs(rnorm(10000))
minVal <- min(x, y)
maxVal <- max(x, y)
maxRange <- c(minVal, maxVal)
buffer <- (maxRange[2] - maxRange[1]) / (xbins / 2)
bindata = data.frame(x=x,y=y,factor=as.factor(1))
h <- hexbin(bindata, xbins = xbins, IDs = TRUE, xbnds = maxRange, ybnds = maxRange)
counts <- hexTapply (h, bindata$factor, table)
counts <- t (simplify2array (counts))
counts <- melt (counts)
colnames (counts) <- c ("factor", "ID", "counts")
counts$factor =as.factor(counts$factor)
hexdf <- data.frame (hcell2xy (h), ID = h@cell)
hexdf <- merge (counts, hexdf)
my_breaks <- c(2, 4, 6, 8, 20, 1000)
clrs <- brewer.pal(length(my_breaks) + 3, "Blues")
clrs <- clrs[3:length(clrs)]
hexdf$countColor <- cut(hexdf$counts, breaks = c(0, my_breaks, Inf), labels = rev(clrs))
# Has consistent hexagon sizes, but no legend
ggplot(hexdf, aes(x=x, y=y, hexID=ID, counts=counts, fill=countColor)) + geom_hex(stat="identity", fill=hexdf$countColor) + scale_fill_manual(labels = as.character(c(0, my_breaks)), values = rev(clrs), name = "Count") + geom_abline(intercept = 0, color = "red", size = 0.25) + labs(x = "A", y = "C") + coord_fixed(xlim = c(-0.5, (maxRange[2]+buffer)), ylim = c(-0.5, (maxRange[2]+buffer))) + theme(aspect.ratio=1)
# Has legend, but inconsistent hexagon sizes
ggplot(hexdf, aes(x=x, y=y, hexID=ID, counts=counts, fill=countColor)) + geom_hex(data=hexdf, stat="identity", aes(fill=countColor)) + scale_fill_manual(labels = as.character(c(0, my_breaks)), values = rev(clrs), name = "Count") + geom_abline(intercept = 0, color = "red", size = 0.25) + labs(x = "A", y = "C") + coord_fixed(xlim = c(-0.5, (maxRange[2]+buffer)), ylim = c(-0.5, (maxRange[2]+buffer))) + theme(aspect.ratio=1)
# One attempt to create consistent hexagon sizes and retain legend
ggplot(hexdf, aes(x=x, y=y, hexID=ID, counts=counts, fill=countColor)) + geom_hex(data=hexdf, aes(fill=countColor)) + geom_hex(stat="identity", fill=hexdf$countColor) + scale_fill_manual(labels = as.character(c(0, my_breaks)), values = rev(clrs), name = "Count") + geom_abline(intercept = 0, color = "red", size = 0.25) + labs(x = "A", y = "C") + coord_fixed(xlim = c(-0.5, (maxRange[2]+buffer)), ylim = c(-0.5, (maxRange[2]+buffer))) + theme(aspect.ratio=1)
Any suggestions on how to keep the hexagon sizes consistent while retaining the legend would be very helpful!
Wow, this is an interesting one -- geom_hex seems to really dislike mapping color/fill onto categorical variables. I assume that's because it is designed to be a two-dimensional histogram and visualize continuous summary statistics, but if anyone has any insight into what's going on behind the scenes, I would love to know.
For your specific problem, that really throws a wrench in the works, because you're attempting to have categorical colorization that assigns non-linear groups to the individual hexagons. Conceptually, you might consider why you're doing that. There may be a good reason, but you're essentially taking a linear color gradient and mapping it non-linearly onto your data, which can end up being visually misleading.
However, if that is what you want to do, the best approach I could come up with was to create a new continuous variable that mapped linearly onto your chosen colors and then use those to create a color gradient. Let me try to walk you through my thought process.
You essentially have a continuous variable (counts) that you want to map onto colors. That's easy enough with a simple color gradient, which is the default in ggplot2 for continuous variables. Using your data:
ggplot(hexdf, aes(x=x, y=y)) +
geom_hex(stat="identity", aes(fill=counts))
yields something close.
However, the bins with really high counts wash out the gradient for points with much lower counts, so we need to change the way the gradient maps colors onto values. You've already declared the colors you want to use in the clrs variable; we just need to add a column to your data frame to use in conjunction with these colors to create a smooth gradient. I did that as follows:
all_breaks <- c(0, my_breaks)
breaks_n <- 1:length(all_breaks)
get_break_n <- function(n) {
break_idx <- max(which((all_breaks - n) < 0))
breaks_n[break_idx]
}
hexdf$bin <- sapply(hexdf$counts, get_break_n)
We create the bin variable as the index of the largest break that is strictly below the count, which matches the (lower, upper] intervals that cut() used for countColor. Now, you'll notice that:
ggplot(hexdf, aes(x=x, y=y)) +
geom_hex(stat="identity", aes(fill=bin))
is getting much closer to the goal.
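The get_break_n() helper can also be written without the sapply() loop. As a sketch, base R's findInterval() with left.open = TRUE returns, for each count, the index of the largest break strictly below it, which is the same rule. The test counts below are made up to cover the break values themselves, and both versions assume positive counts.

```r
my_breaks  <- c(2, 4, 6, 8, 20, 1000)
all_breaks <- c(0, my_breaks)

# Helper in the spirit of the answer: index of the largest break strictly below n
get_break_n <- function(n) {
  max(which((all_breaks - n) < 0))
}

# Made-up counts, including values that sit exactly on a break
counts <- c(1, 2, 3, 8, 9, 20, 21, 1000, 1500)

bin_loop <- sapply(counts, get_break_n)
# Vectorised equivalent: intervals are (break[i], break[i + 1]]
bin_vec <- findInterval(counts, all_breaks, left.open = TRUE)

print(rbind(counts, bin_loop, bin_vec))
```

With the real data this would amount to hexdf$bin <- findInterval(hexdf$counts, all_breaks, left.open = TRUE).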
The next step is to change how the color gradient maps onto that bin variable, which we can do by adding a call to scale_fill_gradientn:
ggplot(hexdf, aes(x=x, y=y)) +
geom_hex(stat="identity", aes(fill=bin)) +
scale_fill_gradientn(colors=rev(clrs[-1])) # odd color reversal to
# match OP's color mapping
This takes a vector of colors between which you want to interpolate a gradient. The way we've set it up, the points along the interpolation will perfectly match up with the unique values of the bin variable, meaning each value will get one of the colors specified.
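Why equally spaced bin values land exactly on the supplied colours can be illustrated with base R's colorRamp(). This is only an analogy: ggplot2's gradient scale does its own interpolation (in a different colour space by default, as far as I know), but the principle that the i-th of n evenly spaced values hits the i-th of n colours is the same. The colours below are placeholders, not the OP's palette.

```r
# Placeholder palette of six colours (upper-case hex to match rgb() output)
cols <- c("#08306B", "#08519C", "#2171B5", "#4292C6", "#6BAED6", "#9ECAE1")

# colorRamp() returns a function mapping [0, 1] to RGB,
# with the supplied colours as evenly spaced knots
ramp <- colorRamp(cols)

# Integer bins 1..6 rescaled to [0, 1] fall exactly on the knots
bins     <- seq_along(cols)
rescaled <- (bins - 1) / (length(cols) - 1)
at_bins  <- rgb(ramp(rescaled), maxColorValue = 255)

# A value between two bins gets a blend of the neighbouring colours
between <- rgb(ramp(0.5), maxColorValue = 255)

print(at_bins)
print(between)
```

The match only holds when the number of distinct bin values equals the number of colours; with more bins than colours, some bins fall between knots and receive blended colours.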
Now we're cooking with gas, and the only thing left to do is add the various bells and whistles from the original graph. Most importantly, we need to make the legend look the way we want. This requires three things: (1) changing it from the default color bar to a discretized legend, (2) specifying our own custom labels, and (3) giving it an informative title.
# create the custom labels for the legend
all_break_labs <- as.character(all_breaks[1:(length(all_breaks)-1)])
ggplot(hexdf, aes(x=x, y=y)) +
geom_hex(stat="identity", aes(fill=bin)) +
scale_fill_gradientn(colors=rev(clrs[-1]),
guide="legend", # (1) make legend discrete
labels=all_break_labs, # (2) specify labels
name="Count") + # (3) legend title
# All the other prettification from the OP
geom_abline(intercept = 0, color = "red", size = 0.25) +
labs(x = "A", y = "C") +
coord_fixed(xlim = c(-0.5, (maxRange[2]+buffer)),
ylim = c(-0.5, (maxRange[2]+buffer))) +
theme(aspect.ratio=1)
All of this leaves us with the following graph:
Hopefully that helps you out. For completeness, here's the new code in full:
# ... the rest of your code before the plots
clrs <- clrs[3:length(clrs)]
hexdf$countColor <- cut(hexdf$counts,
breaks = c(0, my_breaks, Inf),
labels = rev(clrs))
### START OF NEW CODE ###
# create new bin variable
all_breaks <- c(0, my_breaks)
breaks_n <- 1:length(all_breaks)
get_break_n <- function(n) {
break_idx <- max(which((all_breaks - n) < 0))
breaks_n[break_idx]
}
hexdf$bin <- sapply(hexdf$counts, get_break_n)
# create legend labels
all_break_labs <- as.character(all_breaks[1:(length(all_breaks)-1)])
# create final plot
ggplot(hexdf, aes(x=x, y=y)) +
geom_hex(stat="identity", aes(fill=bin)) +
scale_fill_gradientn(colors=rev(clrs[-1]),
guide="legend",
labels=all_break_labs,
name="Count") +
geom_abline(intercept = 0, color = "red", size = 0.25) +
labs(x = "A", y = "C") +
coord_fixed(xlim = c(-0.5, (maxRange[2]+buffer)),
ylim = c(-0.5, (maxRange[2]+buffer))) +
theme(aspect.ratio=1)
