ggplot Discrete Legend Continuous Data - r

I'm new to R programming but I'm enjoying the challenge of writing code!
I created a GIF by stitching multiple map plots together. Unfortunately,
my legend is referencing the particular year of the map being generated and as a result, the GIF shows a legend that has its marks moving up and down. I think the solution would be to have the legend reference the entire data-frame rather than the given year. How do I do this?
Link to the GIF:
https://1drv.ms/i/s!Ap-NxMqZOClHqgsFHSxo-kR1pLrr
##This is the R-Code I used for the year 1950:
kansas1950 <- readShapePoly("KansasCOUNTIES.shp")
## Kansas Winter-Wheat Planted from Quickstats
kansas1950.acres <- read.csv(file = "KWW 19502016 QuickStatsEst.csv",
stringsAsFactors = FALSE)
## Create a smaller dataset by retaining the kansas Acres in 1950 and the FIPS
## FIPS, which will be used for matching and merging with the input shapefile
smaller.data1950 <- data.frame(FIPS = kansas1950.acres$FIPS, Acres = kansas1950.acres$X1950)
smaller.data1950 <- na.omit(smaller.data1950)
## Join the two datasets using their common field
matched.indices1950 <- match(kansas1950#data[, "FIPS"], smaller.data1950[, "FIPS"])
kansas1950#data <- data.frame(kansas1950#data, smaller.data1950[matched.indices1950, ])
## Compute the cartogram transformation of each county using its population
## with the degree of Gaussian blur = 0.5
kansas1950.carto <- quick.carto(kansas1950, kansas1950#data$Acres, blur = 0.5)
## Convert the object into data frame
kansas1950.carto <- gBuffer(kansas1950.carto, byid=TRUE, width=0)
kansas1950.f <- fortify(kansas1950.carto, region = "FIPS")
## Merge the cartogram transformation with the kansas map shapefile
kansas1950.f <- merge(kansas1950.f, kansas1950#data, by.x = "id", by.y = "FIPS")
# Plot of the transformed polygons, where each county is
## further shaded by their acreage (lighter means bigger)
my_map1950 <- ggplot(kansas1950.f, aes(long, lat, group = group,
fill = kansas1950.f$Acres)) + geom_polygon() +
scale_fill_continuous(breaks = c(0, 10000, 100000, 200000, 526000),
labels = c("0 Acres","10k Acres", "100k Acres", "200k Acres", "526k Acres"),
low = "black",
high = "purple"
) +
labs(x=NULL, y=NULL) + labs(fill = "Acres Planted")
# Remove default ggplot layers
my_map1950 <-my_map1950 + theme_bw() + theme(panel.border = element_blank(), panel.grid.major = element_blank(),
panel.grid.minor = element_blank(), axis.ticks=element_blank(),
axis.text.x=element_blank(),axis.text.y=element_blank(),
axis.line = element_line(colour = NA))
# Citation
my_map1950 <- my_map1950 + labs(caption = "USDA-NASS Quick Stats") + ggtitle("1950 Kansas Winter-Wheat Acres Planted")
my_map1950
# Save a higher resolution PNG
png('my_map1950kwwpurp.png', units="in", width=10, height=8, res=300)
my_map1950
dev.off()

Assuming this is what you want, try adding this to your plot (but, of course, specifying your own custom lower and upper limits):
+ scale_fill_gradient(limits = c(0, 10))
I have a sample df that worked:
df <- data.frame(x = 1:10)
p <- ggplot(df, aes(x, 1)) + geom_tile(aes(fill = x), colour = "white")
p + scale_fill_gradient(limits = c(0, 10))
p + scale_fill_gradient(limits = c(0, 20))
Here's the graph with the scale set from 0 to 10.
Here's the graph with the scale set from 0 to 20.
EDIT: Oh, I see now that you have called scale_fill_continuous() in your code. Try adding a limits argument similar to what I did to that.

Related

Historical Variance Error Decomposition plot in R

I found how to estimate the historical Variance Decomposition for VAR models in R in the below link
Historical Variance Error Decompotision Daniel Ryback
Daniel Ryback presents the result in an excel plot, but I wanted to prepare it with ggplot so I created some lines to get it, nevertheless, the plot I got in ggplot is very different to the one showed by Daniel in Excel. I replicated in excel and got the same result than Daniel so it seems there is an error in the way I am preparing the ggplot. Does anyone have a suggestion to arrive to the excel result?
See below my code
library(vars)
library(ggplot2)
library(reshape2)
this code is run after runing the code developed by Daniel Ryback in the link above to define the HD function
data(Canada)
ab<-VAR(Canada, p = 2, type = "both")
HD <- VARhd(Estimation=ab)
HD[,,1]
ex <- HD[,,1]
ex1 <- as.data.frame(ex) # transforming the HD matrix as data frame #
ex2 <- ex1[3:84,1:4] # taking our the first 2 rows as they are N/As #
colnames(ex2) <- c("Emplyment", "Productivity", "Real Wages", "Unemplyment") # renaming columns #
ex2$Period <- 1:nrow(ex2) # creating an id column #
col_id <- grep("Period", names(ex2)) # setting the new variable as id #
ex3 <- ex2[, c(col_id, (1:ncol(ex2))[-col_id])] # moving id variable to the first column #
molten.ex <- melt(ex3, id = "Period") # melting the data frame #
ggplot(molten.ex, aes(x = Period, y = value, fill = variable)) +
geom_bar(stat = "identity") +
guides(fill = guide_legend(reverse = TRUE))
ggplot version
Excel version
The difference is that ggplot2 is ordering the variable factor and plotting it in a different order than excel. If you reorder the factor before plotting it will put 'unemployment' at the bottom and 'employment' at the top, as in excel:
molten.ex$variable <- factor(molten.ex$variable, levels = c("Unemployment",
"Real Wages",
"Productivity",
"Employment"))
ggplot(molten.ex, aes(x = Period, y = value, fill = variable)) +
geom_bar(stat = "identity", width = 0.6) +
guides(fill = guide_legend(reverse = TRUE)) +
# Making the R plot look more like excel for comparison...
scale_y_continuous(limits = c(-6,8), breaks = seq(-6,8, by = 2)) +
scale_fill_manual(name = NULL,
values = c(Unemployment = "#FFc000", # yellow
`Real Wages` = "#A4A4A4", # grey
Productivity = "#EC7C30", # orange
Employment = "#5E99CE")) + # blue
theme(rect = element_blank(),
panel.grid.major.y = element_line(colour = "#DADADA"),
legend.position = "bottom",
axis.ticks = element_blank(),
axis.title = element_blank(),
legend.key.size = unit(3, "mm"))
Giving:
To roughly match the excel graph in Daniel Ryback's post:

ggplot2 map comes out blank with only title and legend R

I'am trying to make maps using ggplot2. Dot maps work fine but choropleth maps just come out blank.
library(broom)
library(rgdal)
library(ggplot2)
philly_df <- tidy(philly)
head(philly_df)
philly$polyID <- sapply(slot(philly, "polygons"), function(x) slot(x, "ID"))
philly_df <- merge(philly_df, philly, by.x = "id", by.y="polyID")
head(philly_df)
# because my polygon number grows from 72 to up to 6000 with repeated entries,
# use distinct to have them back to 72
distinct(philly_df, id, .keep_all = TRUE)
# Generate the data to plot
philly_df <- merge(philly_df, Death_U5, by.x = "DIS_NAME", by.y="District")
head(philly_df)
# when I plot the map, it gives error:
# "Error: Don't know how to add o to a plot"
# and shows only title and legend but no map
ggplot() +
geom_polygon(data = philly_df,
aes(x = long, y = lat, group = group, fill = cut_number(deathU5_Q1, 5))) +
scale_fill_brewer("deaths", palette = "OrRd") +
ggtitle("under 5 malaria deaths for Q1 Philly") +
theme(line = element_blank(),
axis.text=element_blank(),
axis.title=element_blank(),
panel.background = element_blank())
coord_equal()

How to create legend (continuous colourbar) based on specific range of values in R/ggplot2?

I have a dataframe named myKrige_new contains some latitude-longutude wise interpolated values. You can download from HERE. I have plotted this values on a particular area of a country map using ggplot2 package in R and I got this plot
But I want the legend(colourbar) of my plot would be like the following legend.
In my dataset here, the range of the data (pred) is 72 to 257. But I want my legend would show the value 0 to 200 because of comparing reason with other plot though there no value under 72 here .
So, I want to use 20 different colour like above legend that means last box of legend would contain colour regarding value greater than 200. I have used scale_fill_gradientn function but it didn't work. I have spend days to find some option to do it in R, didn't get success. Any kind of help will be highly appreciable.
R code :
library(scales)
library(ggplot2)
myKrige_new <- read.csv ("myKrige_new.csv")
range(myKrige_new$LON)
range(myKrige_new$LAT)
#Original skorea data transformed the same was as myKrige_new
skorea1 <- getData("GADM", country= "KOR", level=1)
skorea1 <- fortify(skorea1)
myKorea1 <- data.frame(skorea1)
###############
ggplot()+
theme_minimal() +
#SOLUTION 1:
#geom_tile(data = myKrige_new, aes(x= LON, y= LAT, fill = pred)) +
#SOLUTION 2: Uncomment the line(s) below:
#geom_point(data = myKrige_new, aes(x= LON, y= LAT, fill = pred),
#shape=22, size=8, colour=NA)+
#Solution 3
stat_summary_2d(data=myKrige_new, aes(x = LON, y = LAT, z = pred),bins = 30,
binwidth = c(0.05,0.05)) +
scale_fill_gradientn(colours=c("white","blue","green","yellow","red"),
values=rescale(c(0,50,100,150,200)),
guide="colorbar", name = "PM10 Conc")+
geom_map(data= myKorea1, map= myKorea1, aes(x=long,y=lat,map_id=id,group=group),
fill=NA, colour="black") +
coord_cartesian(xlim= c(126.6, 127.2), ylim= c(37.2 ,37.7)) +
labs(title= "PM10 Concentration in Seoul Area at South Korea",
x="", y= "")+
theme(legend.position = "bottom")+
guides(fill = guide_colourbar(barwidth = 27, barheight = NULL,
title.position = "bottom", title.hjust = 0.5))
Here is a working solution:
library(scales)
library(ggplot2)
library(raster) # needed for the `getData` function
library(dplyr) # needed for the `mutate` funtion
myKrige_new <- read.csv("~/Downloads/myKrige_new.csv")[-1]
range(myKrige_new$LON)
range(myKrige_new$LAT)
# Original skorea data transformed the same was as myKrige_new
skorea1 <- getData("GADM", country= "KOR", level=1)
skorea1 <- fortify(skorea1)
myKorea1 <- data.frame(skorea1)
# the range of pred goes above 200 (max = 257)
summary(myKrige_new$pred)
ggplot() +
theme_minimal() +
stat_summary_2d(data = mutate(myKrige_new,
pred = ifelse(pred > 200, 200, pred)),
aes(x = LON, y = LAT, z = pred),
bins = 30,
binwidth = c(0.05,0.05)) +
scale_fill_gradientn(colours=c("white","blue","green","yellow","red"),
values=rescale(c(0,50,100,150,200)),
name = expression(paste(PM[10], group("[",paste(mu,g/m^3), "]"))),
limits = c(0,200),
breaks = seq(0,200, 20),
guide = guide_colorbar(nbin = 20,
barwidth = 27,
title.position = "bottom",
title.hjust = 0.5,
raster = FALSE,
ticks = FALSE)) +
geom_map(data= myKorea1,
map= myKorea1,
aes(x=long,y=lat,map_id=id,group=group),
fill=NA,
colour="black") +
coord_equal(xlim= c(126.6, 127.2),
ylim= c(37.2 ,37.7)) +
scale_y_continuous(expand = c(0,0)) +
scale_x_continuous(expand = c(0,0)) +
labs(title = "PM10 Concentration in Seoul Area at South Korea",
x = "",
y = "") +
theme(legend.position = "bottom")
I added limits = c(0,200) and breaks = seq(0, 200, 20) to scale_fill_gradientn as well as nbin = 20 to guide_colorbar, this last change is optional because the default nbin is 20, but in your case you actually need 20. Also, adding limits means values outside the range are plotted in grey50 so I had to transform pred values above 200 to 200 to avoid that; the interpretation of red color is now 200+.
One more thing, the option raster in guide_colorbar changes the colorbar from a raster object to a set of rectangles achieving the look you were looking for.
Finally, I changed the coordinate system from cartesian to equal because you are plotting a map.
Here is the result hope it helps:
Update: added a expand argument to scale_y_continuous and scale_x_continuous as requested by OP

How to control width of multiple plots in ggplot2?

Map Data: InputSpatialData
Yield Data: InputYieldData
Results_using viewport():
EDIT: Results using "multiplot" function as suggested by #rawr (see comment below). I do love the new results, especially that the map is bigger. Nonetheless, the boxplot seems misaligned with the map plot still. Is there a more systematic way to control for centering and placement?
My Question: Is there a way to control for the size of the boxplot plot to make it close in size and centered with the map plot above it?
FullCode:
## Loading packages
library(rgdal)
library(plyr)
library(maps)
library(maptools)
library(mapdata)
library(ggplot2)
library(RColorBrewer)
library(foreign)
library(sp)
library(ggsubplot)
library(reshape)
library(gridExtra)
## get.centroids: function to extract polygon ID and centroid from shapefile
get.centroids = function(x){
poly = wmap#polygons[[x]]
ID = poly#ID
centroid = as.numeric(poly#labpt)
return(c(id=ID, long=centroid[1], lat=centroid[2]))
}
## read input files (shapefile and .csv file)
wmap <- readOGR(dsn=".", layer="ne_110m_admin_0_countries")
wyield <- read.csv(file = "F:/Purdue University/RA_Position/PhD_ResearchandDissert/PhD_Draft/GTAP-CGE/GTAP_Sims&Rests/NewFiles/RMaps_GTAP/AllWorldCountries_CCShocksGTAP.csv", header=TRUE, sep=",", na.string="NA", dec=".", strip.white=TRUE)
wyield$ID_1 <- substr(wyield$ID_1,3,10) # Eliminate the ID_1 column
## re-order the shapefile
wyield <- cbind(id=rownames(wmap#data),wyield)
## Build table of labels for annotation (legend).
labs <- do.call(rbind,lapply(1:17,get.centroids)) # Call the polygon ID and centroid from shapefile
labs <- merge(labs,wyield[,c("id","ID_1","name_long")],by="id") # merging the "labs" data with the spatial data
labs[,2:3] <- sapply(labs[,2:3],function(x){as.numeric(as.character(x))})
labs$sort <- as.numeric(as.character(labs$ID_1))
labs <- cbind(labs, name_code = paste(as.character(labs[,4]), as.character(labs[,5])))
labs <- labs[order(labs$sort),]
## Dataframe for boxplot plot
boxplot.df <- wyield[c("ID_1","name_long","A1B","A1BLow","A1F","A1T","A2","B1","B1Low","B2")]
boxplot.df[1] <- sapply(boxplot.df[1], as.numeric)
boxplot.df <- boxplot.df[order(boxplot.df$ID_1),]
boxplot.df <- cbind(boxplot.df, name_code = paste(as.character(boxplot.df[,1]), as.character(boxplot.df[,2])))
boxplot.df <- melt(boxplot.df, id=c("ID_1","name_long","name_code"))
boxplot.df <- transform(boxplot.df,name_code=factor(name_code,levels=unique(name_code)))
## Define new theme for map
## I have found this function on the website
theme_map <- function (base_size = 14, base_family = "serif") {
# Select a predefined theme for tweaking features
theme_bw(base_size = base_size, base_family = base_family) %+replace%
theme(
axis.line=element_blank(),
axis.text.x=element_text(size=rel(1.2), color="grey"),
axis.text.y=element_text(size=rel(1.2), color="grey"),
axis.ticks=element_blank(),
axis.ticks.length=unit(0.3, "lines"),
axis.ticks.margin=unit(0.5, "lines"),
axis.title.x=element_text(size=rel(1.2), color="grey"),
axis.title.y=element_text(size=rel(1.2), color="grey"),
legend.background=element_rect(fill="white", colour=NA),
legend.key=element_rect(colour="white"),
legend.key.size=unit(1.3, "lines"),
legend.position="right",
legend.text=element_text(size=rel(1.3)),
legend.title=element_text(size=rel(1.4), face="bold", hjust=0),
panel.border=element_blank(),
panel.grid.minor=element_blank(),
plot.title=element_text(size=rel(1.8), face="bold", hjust=0.5, vjust=2),
plot.margin=unit(c(0.5,0.5,0.5,0.5), "lines")
)}
## Transform shapefile to dataframe and merge with yield data
wmap_df <- fortify(wmap)
wmap_df <- merge(wmap_df,wyield, by="id") # merge the spatial data and the yield data
## Plot map
mapy <- ggplot(wmap_df, aes(long,lat, group=group))
mapy <- mapy + geom_polygon(aes(fill=AVG))
mapy <- mapy + geom_path(data=wmap_df, aes(long,lat, group=group, fill=A1BLow), color="white", size=0.4)
mapy <- mapy + labs(title="Average yield impacts (in %) across SRES scenarios ") + scale_fill_gradient2(name = "%Change in yield",low = "red3",mid = "snow2",high = "darkgreen")
mapy <- mapy + coord_equal() + theme_map()
mapy <- mapy + geom_text(data=labs, aes(x=long, y=lat, label=ID_1, group=ID_1), size=6, family="serif")
mapy
## Plot boxplot
boxploty <- ggplot(data=boxplot.df, aes(factor(name_code),value)) +
geom_boxplot(stat="boxplot",
position="dodge",
fill="grey",
outlier.colour = "blue",
outlier.shape = 16,
outlier.size = 4) +
labs(title="Distribution of yield impacts (in %) by GTAP region", y="Yield (% Change)") + theme_bw() + coord_flip() +
stat_summary(fun.y = "mean", geom = "point", shape=21, size= 4, color= "red") +
theme(plot.title = element_text(size = 26,
hjust = 0.5,
vjust = 1,
face = 'bold',
family="serif")) +
theme(axis.text.x = element_text(colour = 'black',
size = 18,
hjust = 0.5,
vjust = 1,
family="serif"),
axis.title.x = element_text(size = 14,
hjust = 0.5,
vjust = 0,
face = 'bold',
family="serif")) +
theme(axis.text.y = element_text(colour = 'black',
size = 18,
hjust = 0,
vjust = 0.5,
family="serif"),
axis.title.y = element_blank())
boxploty
## I found this code on the website, and tried to tweak it to achieve my desired
result, but failed
# Plot objects using widths and height and respect to fix aspect ratios
grid.newpage()
pushViewport( viewport( layout = grid.layout( 2 , 1 , widths = unit( c( 1 ) , "npc" ) ,
heights = unit( c( 0.45 ) , "npc" ) ,
respect = matrix(rep(2,1),2) ) ) )
print( mapy, vp = viewport( layout.pos.row = 1, layout.pos.col = 1 ) )
print( boxploty, vp = viewport( layout.pos.row = 2, layout.pos.col = 1 ) )
upViewport(0)
vp3 <- viewport( width = unit(0.5,"npc") , x = 0.9 , y = 0.5)
pushViewport(vp3)
#grid.draw( legend )
popViewport()
Is this close to what you had in mind?
Code:
library(rgdal)
library(ggplot2)
library(RColorBrewer)
library(reshape)
library(gridExtra)
setwd("<directory with all your files...>")
get.centroids = function(x){ # extract centroids from polygon with given ID
poly = wmap#polygons[[x]]
ID = poly#ID
centroid = as.numeric(poly#labpt)
return(c(id=ID, c.long=centroid[1], c.lat=centroid[2]))
}
wmap <- readOGR(dsn=".", layer="ne_110m_admin_0_countries")
wyield <- read.csv(file = "AllWorldCountries_CCShocksGTAP.csv", header=TRUE)
wyield <- transform(wyield, ID_1 = substr(ID_1,3,10)) #strip leading "TR"
# wmap#data and wyield have common, unique field: name
wdata <- data.frame(id=rownames(wmap#data),name=wmap#data$name)
wdata <- merge(wdata,wyield, by="name")
labs <- do.call(rbind,lapply(1:17,get.centroids)) # extract polygon IDs and centroids from shapefile
wdata <- merge(wdata,labs,by="id")
wdata[c("c.lat","c.long")] <- sapply(wdata[c("c.lat","c.long")],function(x) as.numeric(as.character(x)))
wmap.df <- fortify(wmap) # data frame for world map
wmap.df <- merge(wmap.df,wdata,by="id") # merge data to fill polygons
palette <- brewer.pal(11,"Spectral") # ColorBrewewr.org spectral palette, 11 colors
ggmap <- ggplot(wmap.df, aes(x=long, y=lat, group=group))
ggmap <- ggmap + geom_polygon(aes(fill=AVG))
ggmap <- ggmap + geom_path(colour="grey50", size=.1)
ggmap <- ggmap + geom_text(aes(x=c.long, y=c.lat, label=ID_1),size=3)
ggmap <- ggmap + scale_fill_gradientn(name="% Change",colours=rev(palette))
ggmap <- ggmap + theme(plot.title=element_text(face="bold"),legend.position="left")
ggmap <- ggmap + coord_fixed()
ggmap <- ggmap + labs(x="",y="",title="Average Yield Impacts across SRES Scenarios (% Change)")
ggmap <- ggmap + theme(plot.margin=unit(c(0,0.03,0,0.05),units="npc"))
ggmap
box.df <- wdata[order(as.numeric(wdata$ID_1)),] # order by ID_1
box.df$label <- with(box.df, paste0(name_long," [",ID_1,"]")) # create labels for boxplot
box.df <- melt(box.df,id.vars="label",measure.vars=c("A1B","A1BLow","A1F","A1T","A2","B1","B1Low","B2"))
box.df$label <- factor(box.df$label,levels=unique(box.df$label)) # need this so orderin is maintained in ggplot
ggbox <- ggplot(box.df,aes(x=label, y=value))
ggbox <- ggbox + geom_boxplot(fill="grey", outlier.colour = "blue", outlier.shape = 16, outlier.size = 4)
ggbox <- ggbox + stat_summary(fun.y=mean, geom="point", shape=21, size= 4, color= "red")
ggbox <- ggbox + coord_flip()
ggbox <- ggbox + labs(x="", y="% Change", title="Distribution of Yield Impacts by GTAP region")
ggbox <- ggbox + theme(plot.title=element_text(face="bold"), axis.text=element_text(color="black"))
ggbox <- ggbox + theme(plot.margin=unit(c(0,0.03,0,0.0),units="npc"))
ggbox
grid.newpage()
pushViewport(viewport(layout=grid.layout(2,1,heights=c(0.40,0.60))))
print(ggmap, vp=viewport(layout.pos.row=1,layout.pos.col=1))
print(ggbox, vp=viewport(layout.pos.row=2,layout.pos.col=1))
Explanation:
The last 4 lines of code do most of the work in arranging the layout. I create a viewport layout with 2 viewports arranged as 2 rows in 1 column. The upper viewport is 40% of the height of the grid, the lower viewport is 60% of the height. Then, in the ggplot calls I create a right margin of 3% of the plot width for both the map and he boxplot, and a left margin for the map so that the map and the boxplot are aligned on the left. There's a fair amount of tweaking to get everything lined up, but these are the parameters to play with. You should also know that, since we use coord_fixed() in the map, if you change the overall size of the plot (by resizing the plot window, for example), the map's width will change..
Finally, your code to create the choropleth map is a little dicey...
## re-order the shapefile
wyield <- cbind(id=rownames(wmap#data),wyield)
This does not reorder the shapefile. All you are doing here is prepending the wmap#data rownames to your wyield data. This works if the rows in wyield are in the same order as the polygons in wmap - a very dangerous assumption. If they are not, then you will get a map, but the coloring will be incorrect and unless you study the output very carefully, it is likely to be missed. So the code above creates an association between polygon ID and region name, merges the wyield data based on name, and then merges that into wmp.df based on polygon id.
wdata <- data.frame(id=rownames(wmap#data),name=wmap#data$name)
wdata <- merge(wdata,wyield, by="name")
...
wmap.df <- fortify(wmap) # data frame for world map
wmap.df <- merge(wmap.df,wdata,by="id") # merge data to fill polygons

Histogram of stacked boxes in ggplot2

The graph I'm currently trying to make falls a little between two stools. I want to make a histogram that is composed of stacked and labelled boxes. Here's an example of exactly the sort of thing I'm talking about, taken from a recent article in the New York Times:
http://farm8.staticflickr.com/7109/7026409819_1d2aaacd0a.jpg
Is it possible to achieve this using ggplot2?
To amplify the question somewhat, so far what I have is:
dfr <- data.frame(
name = LETTERS[1:26],
percent = rnorm(26, mean=15)
)
ggplot(dfr, aes(x=percent, fill=name)) + geom_bar() +
stat_bin(geom="text", aes(label=name))
...which I'm clearly doing all wrong. Ultimately what I'd ideally like is something along the lines of the manually-modified graph below, with (say) letters A to M filled one shade and N to Z filled another.
http://farm8.staticflickr.com/7116/7026536711_4df9a1aa12.jpg
Here you go!
set.seed(3421)
# added type to mimick which candidate is supported
dfr <- data.frame(
name = LETTERS[1:26],
percent = rnorm(26, mean=15),
type = sample(c("A", "B"), 26, replace = TRUE)
)
# easier to prepare data in advance. uses two ideas
# 1. calculate histogram bins (quite flexible)
# 2. calculate frequencies and label positions
dfr <- transform(dfr, perc_bin = cut(percent, 5))
dfr <- ddply(dfr, .(perc_bin), mutate,
freq = length(name), pos = cumsum(freq) - 0.5*freq)
# start plotting. key steps are
# 1. plot bars, filled by type and grouped by name
# 2. plot labels using name at position pos
# 3. get rid of grid, border, background, y axis text and lables
ggplot(dfr, aes(x = perc_bin)) +
geom_bar(aes(y = freq, group = name, fill = type), colour = 'gray',
show_guide = F) +
geom_text(aes(y = pos, label = name), colour = 'white') +
scale_fill_manual(values = c('red', 'orange')) +
theme_bw() + xlab("") + ylab("") +
opts(panel.grid.major = theme_blank(), panel.grid.minor = theme_blank(),
axis.ticks = theme_blank(), panel.border = theme_blank(),
axis.text.y = theme_blank())

Resources