Create a concentric circle legend for a ggplot bubble chart - r

I am trying to recreate this visualization of a bubble chart using ggplot2 (I have found the code for doing this in R, but not with the ggplot2 package). This is what I have so far. There are some other errors with my code at the moment, but I want to have the legend show concentric circles for size, versus circles shown in rows. Thanks for your help!
Original visualization:
My reproduction:
My (simplified) code:
crime <-
read.csv("http://datasets.flowingdata.com/crimeRatesByState2005.tsv",
header=TRUE, sep="\t")
ggplot(crime,
mapping= aes(x=murder, y=burglary))+
geom_point(aes(size=population), color="red")+
geom_text(aes(label=state.name), show.legend=FALSE, size=3)+
theme(legend.position = c(0.9, 0.2))

Here's an approach where we build the legend as imagined from scratch.
1) This part slightly tweaks your base chart.
Thank you for including the source data. I missed that earlier and have edited this answer to use it. I switched to a different point shape so that we can specify both outside border (color) as well as interior fill.
library(ggplot2)
gg <- ggplot(crime,
mapping= aes(x=murder, y=burglary))+
geom_point(aes(size=population), shape = 21, color="white", fill = "red")+
ggrepel::geom_text_repel(aes(label = state.name),
size = 3, segment.color = NA,
point.padding = unit(0.1, "lines")) +
theme_classic() +
# Scale point area (not radius) to the value, set the maximum size, and hide the default legend
scale_size_area(max_size = 20, guide = "none")
2) Here I make another table to use for the concentric legend circles
library(dplyr); library(ggplot2)
legend_bubbles <- data.frame(
label = c("3", "20", "40m"),
size = c(3E6, 20E6, 40E6)
) %>%
mutate(radius = sqrt(size / pi))
3) This section adds the legend bubbles, text, and title.
It's not ideal, since different print sizes will require placement tweaks. But it seems like it'd get complicated to get into the underlying grobs with ggplot_build to extract and use those sizing adjustments...
gg + geom_point(data = legend_bubbles,
# The "radius/50" was trial and error. Better way?
aes(x = 8.5, y = 250 + radius/50, size = size),
shape = 21, color = "black", fill = NA) +
geom_text(data = legend_bubbles, size = 3,
aes(x = 8.5, y = 275 + 2 * radius/50, label = label)) +
annotate("text", x = 8.5, y = 450, label = "Population", fontface = "bold")
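The offsets in step 3 follow from simple geometry: when area is proportional to the value, radius goes with the square root of the value, and circles that share a bottom point have their centers one radius above the baseline. Here is a minimal base-R sketch of that arithmetic, in relative units rather than the plot's data units. The `baseline` value and the names `rel_radius`, `center_y` and `top_y` are assumptions for illustration, not part of the answer above.

```r
# Population breaks used for the legend (same values as in step 2)
legend_sizes <- c(3e6, 20e6, 40e6)

# Area is proportional to the value, so radius is proportional to sqrt(value)
radius <- sqrt(legend_sizes / pi)

# Express each radius relative to the largest circle
rel_radius <- radius / max(radius)

# Circles sharing a bottom tangent point: the center sits one radius above the baseline
baseline <- 0                        # assumed baseline, in relative units
center_y <- baseline + rel_radius    # y position of each circle's center
top_y    <- baseline + 2 * rel_radius  # y position of each circle's top, for the label

data.frame(size = legend_sizes, rel_radius, center_y, top_y)
```

This mirrors the `250 + radius/50` and `275 + 2 * radius/50` terms: the divisor only converts radii into y-axis units, which is why it still depends on the print size.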

Related

Problem of different x-axis position when using grid.arrange and legend on bottom

I have to arrange two plots with same axes next to each other and did this with ggplot2 and grid.arrange. Because of a more tidy representation, the legends have to be placed bottom. Unfortunately some times the left plot has more legend entries than the right one and therefore needs a second line, yielding x-axes on different y positions. Therefore it does not only look untidy, the aim of being able to compare these plots is not fulfilled anymore.
Can anybody help?
plot_Volume_left <- some_ggplot2_fct(variable,left) +
theme(legend.position = "bottom")+
theme(legend.background = element_rect(size = 0.5, linetype="solid", colour ="black"))
plot_Volume_right <- some_ggplot2_fct(variable,right,f)+
theme(legend.position = "bottom")+
theme(legend.background = element_rect(size = 0.5, linetype="solid", colour ="black"))
# adjust y axis for more easy compare
upper_lim <- max(plot_Volume_right$data$value, plot_Volume_left$data$value)
lower_lim <- min(plot_Volume_right$data$value, plot_Volume_left$data$value)
plot_Volume_left <- plot_Volume_left + ylim(c(lower_lim, upper_lim))
plot_Volume_right <- plot_Volume_right + ylim(c(lower_lim, upper_lim))
# Arrange plots in grid
grid.arrange(plot_Volume_left, plot_Volume_right,
ncol = 2,
top = textGrob(strTitle,
gp = gpar(fontfamily = "Raleway", fontsize = 15, font = 2)))
In the picture you can see the result:
Do you know an easy way to solve this without too much change in code? (The underlying framework is quite large)

R barplot - highest value on top is hidden

Simple barplot with values on top of bars (I know it is silly - I was forced to add them :)). text works good, but value above highest frequency bar is hidden. I tried margins but it moves the whole plot instead of only the graph area. What can you suggest? Thanks!
x = c(28,1,4,17,2)
lbl = c("1","2","3","4+","tough guys\n(type in)")
bp = barplot(x,names.arg=lbl,main="Ctrl-C clicks",col="grey")
text(x = bp, y = x, label = x, pos = 3, cex = 0.8, col = "red",font=2)
Plot example:
You can fix this by extending the ylim
bp = barplot(x,names.arg=lbl,main="Ctrl-C clicks",col="grey", ylim=c(0,30))
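Rather than hard-coding 30, the upper limit can be derived from the data so the label above the tallest bar still fits if the counts change. A small sketch, assuming about 10% headroom is enough for the label size used here (the `y_top` name and the 1.1 factor are illustrative choices):

```r
x   <- c(28, 1, 4, 17, 2)
lbl <- c("1", "2", "3", "4+", "tough guys\n(type in)")

# Leave roughly 10% headroom above the tallest bar for its label
y_top <- max(x) * 1.1

bp <- barplot(x, names.arg = lbl, main = "Ctrl-C clicks", col = "grey",
              ylim = c(0, y_top))

# bp holds the bar midpoints, so each label is centered above its bar
text(x = bp, y = x, labels = x, pos = 3, cex = 0.8, col = "red", font = 2)
```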
Another solution using ggplot2:
library(ggplot2)
x = c(28,1,4,17,2)
lbl = c("1","2","3","4+","tough guys \n(type in)")
test <- data.frame(x, lbl)
bp = ggplot(test, aes(x=lbl, y= x))+
geom_bar(fill = "grey", stat="identity")+ ## set the fill color of the bars and use the values in x as bar heights
geom_text(aes(label= x), vjust = -1, color = "red")+
ggtitle("Ctrl-C clicks")+
theme_bw()+ ## give black and white theme
theme(plot.title = element_text(hjust = 0.5),## adjust position of title
panel.grid.minor=element_blank(), ## suppress minor grid lines
panel.grid.major=element_blank() ##suppress major grid lines
)+
scale_y_continuous(limits = c(0,30)) ## set scale limits
bp

Create a rectangle filled with text

I'd like to create a filled rectangle in R, with white text centered in the middle, and export it to png. I know the rect() function can probably do this, but every example I've seen the rectangle is printed on a plot. Is there a way to do this without the plot?
For reference, I'm building a blogdown() site and trying to create a square that looks pretty much identical to those in the Hugrid theme.
You can use geom_rect() to create rectangles and geom_text() to place text inside them. Modifying the rectangle's look (color, line size or type) in ggplot2 is easy. All you have to do is remove the default ggplot2 look with theme_classic() and element_blank().
# Generate dummy dataset
foo <- data.frame(x1 = 1, x2 = 2, y1 = 1, y2 = 2,
text = paste(letters[1:3], letters[1:3], collapse = "\n"))
# Plot rectangle with text
library(ggplot2)
ggplot(foo) +
geom_rect(aes(xmin = x1, xmax = x2, ymin = y1, ymax = y2),
color = "black", size = 2, fill = "lightblue") +
geom_text(aes(x = x1 + (x2 - x1) / 2, y = y1 + (y2 - y1) / 2,
label = text),
size = 20) +
theme_classic() +
theme(axis.line = element_blank(),
axis.ticks = element_blank(),
axis.text = element_blank(),
axis.title = element_blank())
Here's a lightweight solution,
rects <- data.frame(fill = RColorBrewer::brewer.pal(5, "Pastel1"),
colour = RColorBrewer::brewer.pal(5, "Set1"),
label = paste("text", 1:5), stringsAsFactors = FALSE)
library(grid) # provides grobTree(), rectGrob(), textGrob() and gpar()
library(gridExtra)
gl <- mapply(function(f,l,c) grobTree(rectGrob(gp=gpar(fill=f, col="white",lwd=2)),
textGrob(l, gp=gpar(col=c))),
f = rects$fill, l = rects$label, c = rects$colour,
SIMPLIFY = FALSE)
grid.arrange(grobs=gl)
It's not quite clear from your question what exactly the sticking point is. Do you need to generate the rectangles from R (instead of, say, manually in Illustrator)? And no plot window must be shown?
All of this can be achieved. I prefer to draw with ggplot2, and the specific geoms you'd need here are geom_tile() for the rectangles and geom_text() for the text. And you can save to png without generating a plot by using ggsave().
rects <- data.frame(x = 1:4,
colors = c("red", "green", "blue", "magenta"),
text = paste("text", 1:4))
library(ggplot2)
p <- ggplot(rects, aes(x, y = 0, fill = colors, label = text)) +
geom_tile(width = .9, height = .9) + # make square tiles
geom_text(color = "white") + # add white text in the middle
scale_fill_identity(guide = "none") + # color the tiles with the colors in the data frame
coord_fixed() + # make sure tiles are square
theme_void() # remove any axis markings
ggsave("test.png", p, width = 4.5, height = 1.5)
I made four rectangles in this example. If you need only one you can just make an input data frame with only one row.
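For a single tile with no axes or plot window at all, the grid package that ships with R may be enough on its own. A minimal sketch, assuming a PNG device is available on your system. The output path, the colors and the "Posts" label are placeholders.

```r
library(grid)

out_file <- file.path(tempdir(), "tile.png")  # hypothetical output path

png(out_file, width = 300, height = 300)  # open a PNG device; nothing is shown on screen
grid.newpage()
grid.rect(gp = gpar(fill = "steelblue", col = NA))  # rectangle covering the whole page
grid.text("Posts",                                  # text is centered by default
          gp = gpar(col = "white", fontsize = 28, fontface = "bold"))
dev.off()                                 # close the device so the file is written
```

Because the rectangle fills the whole device, the `width` and `height` passed to `png()` set the size and aspect ratio of the tile.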

Generating a color legend with shifted labels using ggplot2

I usually plot maps using GrADS, and usually I use a color legend that in the plot will look like this:
I would like to do the same using ggplot2 in R. If using simply:
g <- g + scale_fill_brewer(palette="Greens", na.value="NA", name=legendtitle)
#g is the previously saved plot with some simple options, prepared with cut()
The output is of course this:
So I'd like to be able to do two things:
Shift the labels so they sit between the colors, not indicating the interval above (note: renaming the labels is not the problem, shifting them is)
Last label should be arrow-like, to indicate that entries above the maximum (10 in the example) are indicated in the darkest color (dark green in the example).
EDIT:
Using help from the answer below, I've come to this:
Part 1. of my question is almost solved, even if it does not look perfect. I'd really like to get rid of the white space between the colors, any idea? Part 2... I have no idea whatsoever.
My code snippet uses:
theme(legend.position="bottom", legend.key.width = unit(1, "cm"), legend.key.height = unit(0.3, "cm")) + guides(fill=guide_legend(label.position = "bottom", label.hjust = 1.2))
Here is something to get you started.
The idea is that you use cut() to create the cut points, but specify the labels in the way you desire.
By putting the legend at the bottom of the plot, ggplot automatically puts the legend labels "in between" the values.
library(ggplot2)
dat <- data.frame(x=0:100, y=runif(101, 0, 10), z=seq(0, 12, len=101))
dat$col <- cut(
dat$z,
breaks=c(0, 2, 4, 6, 8, 10, Inf),
labels=c(2, 4, 6, 8, 10, "-->")
)
ggplot(dat, aes(x, y, col=col)) +
geom_point(size=10) +
scale_colour_brewer("", palette="Greens") +
theme(legend.position="bottom")
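To see what the relabelled `cut()` call produces, here is a small base-R check. Each interval is labelled with its upper bound, and the open-ended last interval gets the arrow. The vector `z` is made-up sample data.

```r
z <- c(1, 3, 5, 7, 9, 11)           # made-up values, one per interval
breaks <- c(0, 2, 4, 6, 8, 10, Inf)
labels <- c(2, 4, 6, 8, 10, "-->")  # coerced to character labels

bins <- cut(z, breaks = breaks, labels = labels)
as.character(bins)
# "2" "4" "6" "8" "10" "-->"

# Intervals are right-closed by default, so a value equal to the lowest
# break falls outside every bin unless include.lowest = TRUE is set
cut(0, breaks = breaks, labels = labels)                         # NA
cut(0, breaks = breaks, labels = labels, include.lowest = TRUE)  # "2"
```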
As you didn't provide the geom that you want to use in the end, I modified Andrie's answer a little bit. I included a rectangle geom (e.g. geom_col) to fill the complete legend boxes. The bars themselves are hidden by setting their alpha to 0, while the legend keys are overridden to alpha 1.
# data
set.seed(1324)
dat <- data.frame(x=0:100, y=runif(101, 0, 10), z=seq(0, 12, len=101))
# add discrete values
dat$col <- cut(
dat$z,
include.lowest = TRUE,
breaks=c(0, 2, 4, 6, 8, 10, Inf),
labels=c(2, 4, 6, 8, 10, "-->")
)
# the plot
ggplot(dat, aes(x,y,fill=col)) +
geom_point(aes(col=col),size=8, show.legend = F) +
geom_col(alpha=0)+
scale_fill_brewer("", palette = "Greens")+
scale_colour_brewer("", palette="Greens")+
scale_alpha_discrete(range=c(0,1))+
guides(fill = guide_legend(nrow=1, override.aes = list(alpha = 1),
label.position="bottom",
label.hjust = .5)) +
theme(legend.position="bottom",
legend.key.width = unit(3, "cm"),
legend.key.height = unit(1, "cm"))
The first part of the question can be solved with a continuous color scale. So in addition to your discrete scale, just add a continuous color scale (you may have to relabel it). Then you can put the discrete scale at the top or bottom and you should be set. Here's a reproducible example:
require(scales)
nlvls <- nlevels(diamonds$cut)
ggplot(diamonds, aes(x = price, fill = cut)) +
geom_histogram(position = "dodge", binwidth = 1000) +
scale_fill_brewer(palette="Greens", na.value="NA", guide='none') +
theme(legend.position = 'bottom') +
geom_line(aes(x=price, y=0, color=as.numeric(cut)), linetype=0) +
scale_color_continuous(name = 'cont. scale',
low = brewer_pal(pal = "Greens")(nlvls)[1],
high = brewer_pal(pal = "Greens")(nlvls)[nlvls])
For the arrows, I have absolutely no idea how you would do this with ggplot2. You can probably hack something together with grid, but it may be more trouble than it's worth.

Showing separate legend for a geom_text layer?

I have the following plot:
library(ggplot2)
ib<- data.frame(
category = factor(c("Cat1","Cat2","Cat1", "Cat1", "Cat2","Cat1","Cat1", "Cat2","Cat2")),
city = c("CITY1","CITY1","CITY2","CITY3", "CITY3","CITY4","CITY5", "CITY6","CITY7"),
median = c(1.3560, 2.4830, 0.7230, 0.8100, 3.1480, 1.9640, 0.6185, 1.2205, 2.4000),
samplesize = c(851, 1794, 47, 189, 185, 9, 94, 16, 65)
)
p<-ggplot(data=ib, aes(x=city, y=category, size=median, colour=category, label=samplesize)) +
geom_point(alpha=.6) +
scale_area(range=c(1,15)) +
scale_colour_hue(guide="none") +
geom_text(aes(size = 1), colour="black")
p
(I'm plotting the circles proportional to a median value and overlaying with a text label representing the sample size. image at http://imgur.com/T82cF)
Is there any way to SEPARATE the two legends? I would like one legend (labeled "median") to give the scale of circles, and the other legend with a single letter "a" (or even better a number) which I could label "sample size". Since the two properties are not related in any way, it doesn't make sense to bundle them in the same legend.
I've tried all sorts of combinations but the best I can come up with is losing the text legend altogether :)
thanks for the answer!
Update: scale_area has been deprecated, so scale_size is used instead. The gtable function gtable_filter() is used to extract the legends, and modified code is used to replace the default legend key in one of the legends.
If you are still looking for an answer to your question, here's one that seems to do most of what you want, although it's a bit of a hack in places. The symbol in the legend can be changed using kohske's comment here
The difficulty was trying to apply the two different size mappings. So, I've left the dot size mapping inside the aesthetic statement but removed the label size mapping from the aesthetic statement. This means that label size has to be set according to discrete values of a factor version of samplesize (fsamplesize). The resulting chart is nearly right, except the legend for label size (i.e., samplesize) is not drawn. To get round that problem, I drew a chart that contained a label size mapping according to the factor version of samplesize (but ignoring the dot size mapping) in order to extract its legend which can then be inserted back into the first chart.
## Your data
ib<- data.frame(
category = factor(c("Cat1","Cat2","Cat1", "Cat1", "Cat2","Cat1","Cat1", "Cat2","Cat2")),
city = c("CITY1","CITY1","CITY2","CITY3", "CITY3","CITY4","CITY5", "CITY6","CITY7"),
median = c(1.3560, 2.4830, 0.7230, 0.8100, 3.1480, 1.9640, 0.6185, 1.2205, 2.4000),
samplesize = c(851, 1794, 47, 189, 185, 9, 94, 16, 65)
)
## Load packages
library(ggplot2)
library(gridExtra)
library(gtable)
library(grid)
## Obtain the factor version of samplesize.
ib$fsamplesize = cut(ib$samplesize, breaks = c(0, 100, 1000, Inf))
## Obtain plot with dot size mapped to median, the label inside the dot set
## to samplesize, and the size of the label set to the discrete levels of the factor
## version of samplesize. Here, I've selected three sizes for the labels (3, 6 and 10)
## corresponding to samplesizes of 0-100, 100-1000, >1000. The sizes of the labels are
## set using three calls to geom_text - one for each size.
p <- ggplot(data=ib, aes(x=city, y=category)) +
geom_point(aes(size = median, colour = category), alpha = .6) +
scale_size("Median", range=c(0, 15)) +
scale_colour_hue(guide = "none") + theme_bw()
p1 <- p +
geom_text(aes(label = ifelse(samplesize > 1000, samplesize, "")),
size = 10, color = "black", alpha = 0.6) +
geom_text(aes(label = ifelse(samplesize < 100, samplesize, "")),
size = 3, color = "black", alpha = 0.6) +
geom_text(aes(label = ifelse(samplesize > 100 & samplesize < 1000, samplesize, "")),
size = 6, color = "black", alpha = 0.6)
## Extract the legend from p1 using gtable_filter() from the gtable package
g1 = ggplotGrob(p1)
leg1 = gtable_filter(g1, "guide-box")
## Keep p1 but dump its legend
p1 = p1 + theme(legend.position = "none")
## Get second legend - size of the label.
## Draw a dummy plot, using fsamplesize as a size aesthetic. Note that the label sizes are
## set to 3, 6, and 10, matching the sizes of the labels in p1.
dummy.plot = ggplot(data = ib, aes(x = city, y = category, label = samplesize)) +
geom_point(aes(size = fsamplesize), colour = NA) +
geom_text(show.legend = FALSE) + theme_bw() +
guides(size = guide_legend(override.aes = list(colour = "black", shape = utf8ToInt("N")))) +
scale_size_manual("Sample Size", values = c(3, 6, 10),
breaks = levels(ib$fsamplesize), labels = c("< 100", "100 - 1000", "> 1000"))
## Get the legend from dummy.plot using gtable_filter() from the gtable package
g2 = ggplotGrob(dummy.plot)
leg2 = gtable_filter(g2, "guide-box")
## Arrange the three components (p1, leg1, leg2) using functions from the gridExtra package
## The two legends are arranged using the inner arrangeGrob function. The resulting
## chart is then arranged with p1 in the outer arrangeGrob function.
ib.plot = arrangeGrob(p1, arrangeGrob(leg1, leg2, nrow = 2), ncol = 2,
widths = unit(c(9, 2), c("null", "null")))
## Draw the graph
grid.newpage()
grid.draw(ib.plot)
This actually doesn't directly address your question, but it is how I might go about creating a graph with the general characteristics you describe:
ib$ss <- paste("n = ",ib$samplesize,sep = "")
ggplot(data=ib, aes(x=city, y=category, size=median, colour=category, label=ss)) +
geom_point(alpha=.6) +
geom_text(size = 2, vjust = -1.2,colour="black") +
scale_colour_hue(guide = "none")
I removed the scale_area piece, as I'm not sure what purpose it served and it was causing errors for me.
So the rationale here is that the sample size information feels more like an annotation to me than something that deserves its own scale and legend. Opinions may differ on that, of course, but I thought I'd put it out there in case you find it useful.
This too doesn't answer your question. I've left samplesize inside the circle. Also, samplesize to me is more like an annotation than a legend.
But I think you are using an old version of ggplot2. There have been some changes in ggplot2 version 0.9.0. I've made the changes below.
p<-ggplot(data=ib, aes(x=city, y=category, size=median, colour=category, label=samplesize)) +
geom_point(alpha=.6) +
scale_area(range = c(1,15)) + # range instead of to
scale_colour_hue(guide = "none") + # guide instead of legend
geom_text(size = 2.5, colour="black")
p
