R: Counting how many polygons between two

I was trying to recreate a map showing how many municipalities away you are from Cracow, and to change the city from Cracow to Wrocław. The map was done in GIMP.
I got a shapefile (available here: http://www.gis-support.pl/downloads/powiaty.zip). I read the documentation for packages like maptools, rgdal and sf, but I couldn't find a function that counts this automatically, and I would rather not do it manually.
Is there a function to do that?
Credits: The map was done by Hubert Szotek on https://www.facebook.com/groups/mapawka/permalink/1850973851886654/

I am not that experienced with network analysis, so I must confess that I do not understand every single line of the code that follows. But it works! A lot of the material was adapted from here: https://cran.r-project.org/web/packages/spdep/vignettes/nb_igraph.html
This is the final result:
Code
# Load packages
library(raster) # loads shapefile
library(igraph) # build network
library(spdep) # builds network
library(RColorBrewer) # for plot colour palette
library(ggplot2) # plots results
# Load Data
powiaty <- shapefile("powiaty/powiaty")
Firstly the poly2nb function is used to calculate neighbouring regions:
# Find neighbouring areas
nb_q <- poly2nb(powiaty)
This creates our spatial mesh, which we can see here:
# Plot original results
coords <- coordinates(powiaty)
plot(powiaty)
plot(nb_q, coords, col="grey", add = TRUE)
This is the bit where I am not 100% sure what is happening. Basically, it converts the neighbour list into a graph, works out the shortest path (the number of borders crossed) between every pair of polygons in the network, and returns a matrix of these distances.
# Sparse matrix
nb_B <- nb2listw(nb_q, style="B", zero.policy=TRUE)
B <- as(nb_B, "symmetricMatrix")
# Calculate shortest distance
g1 <- graph.adjacency(B, mode="undirected")
dg1 <- diameter(g1)
sp_mat <- shortest.paths(g1)
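To make the content of the shortest-path matrix more concrete, here is a minimal base-R sketch on a made-up network of four areas in a row (A, B, C, D). It does not use igraph; the helper hops_from is a hypothetical name, and the breadth-first search is only meant to illustrate what a hop count between neighbouring polygons means.

```r
# Toy adjacency matrix for four areas in a row: A - B - C - D (made-up data)
adj <- matrix(0, 4, 4, dimnames = list(LETTERS[1:4], LETTERS[1:4]))
adj["A", "B"] <- 1
adj["B", "C"] <- 1
adj["C", "D"] <- 1
adj <- adj + t(adj)  # make it symmetric (undirected)

# Hypothetical helper: breadth-first search counting border crossings from one area
hops_from <- function(adj, start) {
  hops <- setNames(rep(NA_integer_, nrow(adj)), rownames(adj))
  hops[start] <- 0L
  frontier <- start
  while (length(frontier) > 0) {
    nxt <- character(0)
    for (node in frontier) {
      nb <- names(which(adj[node, ] > 0))  # direct neighbours of this area
      new <- nb[is.na(hops[nb])]           # neighbours not reached yet
      hops[new] <- hops[node] + 1L
      nxt <- c(nxt, new)
    }
    frontier <- unique(nxt)
  }
  hops
}

# One row per start area, one column per destination
hop_mat <- t(sapply(rownames(adj), hops_from, adj = adj))
hop_mat
```

Each row of hop_mat plays the same role as a row of sp_mat: the number of steps from one area to every other area.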
Having made the calculations, the data can now be formatted to get into plotting format, so the shortest path matrix is merged with the spatial dataframe.
I wasn't sure what would be best to use as the ID for referring to datasets so I chose the jpt_kod_je variable.
# Name used to identify data
referenceCol <- powiaty$jpt_kod_je
# Rename spatial matrix
sp_mat2 <- as.data.frame(sp_mat)
names(sp_mat2) <- paste0("Ref", referenceCol)
# Add the id last so that it keeps its name and stays the final column
sp_mat2$id <- rownames(powiaty@data)
# Add distance to shapefile data
powiaty@data <- cbind(powiaty@data, sp_mat2)
powiaty@data$id <- rownames(powiaty@data)
The data is now in a suitable format to display. Using the basic function spplot we can get a graph quite quickly:
displaylayer <- "Ref1261" # id for Krakow
# Plot the results as a basic spplot
spplot(powiaty, displaylayer)
I prefer ggplot for plotting more complex graphs as you can control the styling more easily. However, it is a bit more picky about how the data is fed into it, so we need to reformat the data before we build the graph:
# Or if you want to do it in ggplot
# The id is stored in the last column of sp_mat2
filtered <- data.frame(id = sp_mat2[,ncol(sp_mat2)], dist = sp_mat2[[displaylayer]])
# fortify() turns the polygons into a data frame of vertices
ggplot_powiaty <- fortify(powiaty)
ggplot_powiaty <- merge(x = ggplot_powiaty, y = filtered, by = "id")
names(ggplot_powiaty)
And the plot. I have customised it a bit by removing elements which aren't required and adding a background. Also, to highlight the region at the centre of the search in grey, I subset the data using ggplot_powiaty[ggplot_powiaty$dist == 0, ], and then plot this as another polygon.
ggplot(ggplot_powiaty, aes(x = long, y = lat, group = group, fill = dist)) +
geom_polygon(colour = "black") +
geom_polygon(data =ggplot_powiaty[ggplot_powiaty$dist == 0, ],
fill = "grey60") +
labs(title = "Distance of Counties from Krakow", caption = "Mikey Harper") +
scale_fill_gradient2(low = "#d73027", mid = "#fee08b", high = "#1a9850", midpoint = 10) +
theme(
axis.line = element_blank(),
axis.text.x = element_blank(),
axis.text.y = element_blank(),
axis.ticks = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
panel.grid.minor = element_blank(),
panel.grid.major = element_blank(),
plot.background = element_rect(fill = "#f5f5f2", color = NA),
panel.background = element_rect(fill = "#f5f5f2", color = NA),
legend.background = element_rect(fill = "#f5f5f2", color = NA),
panel.border = element_blank())
To plot for Wrocław as shown at the top of the post, just change displaylayer <- "Ref0264" and update the title.
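Switching the origin amounts to picking a different column of the distance table. A minimal sketch with made-up hop counts follows; the codes 1261 and 0264 are the ones used above, while 9999 and all the numbers are invented for illustration.

```r
# Area codes: 1261 (Krakow) and 0264 (Wroclaw) come from the post; 9999 is made up
codes <- c("1261", "0264", "9999")

# Made-up symmetric hop counts between the three areas
hops <- matrix(c(0, 2, 1,
                 2, 0, 3,
                 1, 3, 0), nrow = 3, byrow = TRUE)

# One column per origin, named the same way as in the answer
tab <- as.data.frame(hops)
names(tab) <- paste0("Ref", codes)
tab$id <- codes

# Choosing an origin is just choosing a column
displaylayer <- "Ref0264"
tab[[displaylayer]]               # hops from the chosen origin to every area
tab$id[tab[[displaylayer]] == 0]  # the origin itself has distance 0
```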

Related

ggplot - using a vector mask for a raster

I am trying to create a vector mask for a raster. The raster is some color gradient created elsewhere. Here I am discussing only the vector mask.
Using the raster and sf packages seems to be overkill for this simple case. The best way I came up with is to plot the vector object, ggsave it to a raster file, read it back and then overlay it on the original raster.
I will be happy to hear any better suggestions.
Anyway, when I write the plot to the file there is always a small frame around it. It may not be visible when displaying the file on screen, but it's problematic in my case.
I could remove the frame, but I cannot rely on color only and I am not sure that it's always the same size. Here is my example:
library(tidyverse)
library(reshape2)
library(bmp)
pol <- tibble(x = c(1, 3, 5, 4), y = c(3,5, 4, 1))
p <- ggplot(pol) +
geom_polygon(aes(x,y), fill = "red") +
theme(panel.background = element_rect(fill = "black"),
panel.grid = element_blank(),
axis.title = element_blank(),
axis.ticks = element_blank(),
axis.text = element_blank())
ggsave("pol.bmp", p, dpi = "screen")
bmp <- read.bmp("pol.bmp")
bmp <- melt(bmp, varnames = c("y", "x")) %>%
mutate(value = as.factor(value))
ggplot(bmp) +
geom_raster(aes(x,y, fill = value)) +
theme(legend.position="none")
The initial plot
The rasterized plot (ignore colors)
Please advise

ggVennDiagram, define decimal places

Does someone have an idea how to change the number of displayed decimal places when using the ggVennDiagram function (R)?
# Example code
install.packages("ggVennDiagram")
library(ggVennDiagram)
library(ggplot2) # for theme() and element_blank()
my_species <- paste("species", 1:50, sep="")
set.seed(2)
x <-list(A = sample(my_species, 12),
B = sample(my_species, 20),
C = sample(my_species, 16),
D = sample(my_species, 2))
# venn plot
ggVennDiagram(x, label="percent") +
theme(axis.text = element_blank(),
legend.position = "none",
axis.ticks = element_blank(),
axis.title = element_blank())
From the source code (https://www.rdocumentation.org/packages/ggVennDiagram/versions/0.3/source), I see that the author defined two decimal places for label="percent".
Can I overwrite this in my R code, so that I have either no decimal places or just one?
# from function "plot_venn()" in source code of ggVennDiagram
counts <- counts %>%
mutate(percent=paste(round(.data$count*100/sum(.data$count),digits = 2),"%",sep=""))
Thank you very much in advance!
There is always a solution, but since this is hard-coded deep inside the package, it's going to get ugly.
In this case, one way to do it is to initialise the figure without percentages and then add them yourself the way ggVennDiagram would have, which requires a bit of backtracking through the code and reaching into its internals.
g <- ggVennDiagram(x, label=NULL) +
theme(axis.text = element_blank(),
legend.position = "none",
axis.ticks = element_blank(),
axis.title = element_blank())
g
## Notice label=NULL above. We add the labels ourselves like so:
library(dplyr) # provides %>% and mutate()
region_data <- ggVennDiagram:::four_dimension_ellipse_regions(n.sides=3000)
counts <- ggVennDiagram:::four_dimension_region_values(x)
polygon <- region_data[[1]]
center <- region_data[[2]]
counts <- counts %>%
mutate(percent=paste(round(.data$count*100/sum(.data$count),digits = 1),"%",sep="")) %>%
mutate(label = paste(.data$count,"\n","(",.data$percent,")",sep=""))
data <- merge(counts,center)
g + geom_label(aes_string(label="percent"),data=data,label.size = NA, alpha=.5)
(note, the code above was just copied from the package itself, the work goes into reverse engineering and figuring out which bits you need, and in which order)
You should notify the author of the package of this need, and ask him to offer this as a function argument.
ggVennDiagram supports configuring the number of percent digits as of version 1.1. You may update it and set label_percent_digit as follows:
ggVennDiagram(x, label_percent_digit = 1, label = "percent")
See https://venn.bio-spring.info/using-ggvenndiagram#setting-region-label for more information.
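The effect of the digits argument on the percent label can be checked in isolation with base R. This is a minimal sketch that mirrors the round()/paste() expression quoted from the package source; percent_label is a hypothetical helper name and the counts are made up.

```r
# Hypothetical helper mirroring the percent expression from plot_venn()
percent_label <- function(count, digits) {
  paste0(round(count * 100 / sum(count), digits = digits), "%")
}

counts <- c(5, 3, 1)  # made-up region counts

percent_label(counts, digits = 2)  # two decimal places, as hard-coded in version 0.3
percent_label(counts, digits = 1)  # one decimal place
percent_label(counts, digits = 0)  # no decimal places
```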

Edit labels in tooltip for plotly maps using ggplot2 in r

I know this question has been asked a number of times, but I think some of the underlying syntax for plotly has changed since those questions were asked. Using ggplotly() to create a choropleth map gives the default tooltip of long, lat, group, and one of the variables from my aesthetics. I understand that the tooltip shows only what is in the aesthetics. All I want to do is customize the tooltip so it displays some of the variables in my dataset (including those not mapped to aesthetics) and not others (such as the coordinates). Below is a reproducible example and what I've tried so far. I followed the advice given in response to other questions, to no avail.
#Load dependencies
library(rgeos)
library(stringr)
library(rgdal)
library(maptools)
library(ggplot2)
library(plotly)
#Function to read shapefile from website
dlshape=function(shploc, shpfile) {
temp=tempfile()
download.file(shploc, temp)
unzip(temp)
shp.data <- sapply(".", function(f) {
fp <- file.path(temp, f)
return(readOGR(".",shpfile))
})
}
austria <- dlshape(shploc="http://biogeo.ucdavis.edu/data/gadm2.8/shp/AUT_adm_shp.zip",
"AUT_adm1")[[1]]
#Create random data to add as variables
austria@data$example1<-sample(seq(from = 1, to = 100, by = 1), size = 11, replace = TRUE)
austria@data$example2<-sample(seq(from = 1, to = 100, by = 1), size = 11, replace = TRUE)
austria@data$example3<-sample(seq(from = 1, to = 100, by = 1), size = 11, replace = TRUE)
#Fortify shapefile to use w/ ggplot
austria.ft <- fortify(austria, region="ID_1")
data<-merge(austria.ft, austria, region="id", by.x = "id", by.y = "ID_1")
#Save as ggplot object
gg<-ggplot(data, aes(x = long, y = lat, fill = example1, group = group)) +
geom_polygon() + geom_path(color="black",linetype=1) +
coord_equal() +
scale_fill_gradient(low = "lightgrey", high = "darkred", name='Index') +xlab("")+ylab("") +
theme(axis.text = element_blank(),
axis.title = element_blank(),
axis.ticks = element_blank()) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black")) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"))
#Plot using ggplotly
ggplotly(gg)
From here I've tried two different approaches. The more successful of the two gets me part of the way there. I can add new variables to the tooltip, but I cannot do two things: 1) I cannot get rid of other variables already displayed by default (from the aesthetics) and 2) I cannot rename the variables to something other than their column name in the dataset (for example, I would like to label "example3" as "Example III"). Here is that approach:
#Save as a new ggplot object except this time add ``label = example3`` to the aesthetics
gg2<-ggplot(data, aes(x = long, y = lat, fill = example1, group = group, label = example3)) +
geom_polygon() + geom_path(color="black",linetype=1) +
coord_equal() +
scale_fill_gradient(low = "lightgrey", high = "darkred", name='Index') +xlab("")+ylab("") +
theme(axis.text = element_blank(),
axis.title = element_blank(),
axis.ticks = element_blank()) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black")) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"))
#Save as plotly object then plot
gg2 <- plotly_build(gg2)
gg2
I also tried adding the following but it did nothing:
gg2$data[[1]]$text <- paste("Example I:", data$example1, "<br>",
"Example II:", data$example2, "<br>",
"Example III:", data$example3)
Any help is much appreciated!
UPDATE: I updated plotly by installing from github instead of CRAN. Using this updated version (4.0.0) I've made it part of the way there.
gg2$x$data[[2]]$text <- paste("Example I:", data$example1, "<br>",
"Example II:", data$example2, "<br>",
"Example III:", data$example3)
gg2
What happens now simply baffles me. This adds an additional tooltip separate from the previous one. This new tooltip is exactly what I want however both of them appear -not at once but if I move my mouse around. See the two screenshots below:
Notice those tooltips are from the same unit (Tirol). Could this be a bug in the package? This does not occur when displaying other graphs, such as a time series, instead of a map. Also note that I assigned the label "Example I" (or II or III) and this does not show on the new tooltip I added.
UPDATE #2: I figured out that the old tooltip (with long and lat shown) only appears when hovering over the borders so I got rid of the geom_path(color="black",linetype=1) command (as to remove the borders) and now I've managed to successfully solve that problem. However, I'm still unable to modify the labels that appear in the tooltip.
UPDATE #3: I figured out how to edit the labels but FOR ONLY ONE VARIABLE. Which is nuts! Here's my workflow from start to finish:
#Load dependencies
library(rgeos)
library(stringr)
library(rgdal)
library(maptools)
library(ggplot2)
library(plotly)
#Function to read shapefile from website
dlshape=function(shploc, shpfile) {
temp=tempfile()
download.file(shploc, temp)
unzip(temp)
shp.data <- sapply(".", function(f) {
fp <- file.path(temp, f)
return(readOGR(".",shpfile))
})
}
austria <- dlshape(shploc="http://biogeo.ucdavis.edu/data/gadm2.8/shp/AUT_adm_shp.zip",
"AUT_adm1")[[1]]
#Create random data to add as variables
austria@data$example1<-sample(seq(from = 1, to = 100, by = 1), size = 11, replace = TRUE)
austria@data$example2<-sample(seq(from = 1, to = 100, by = 1), size = 11, replace = TRUE)
austria@data$example3<-sample(seq(from = 1, to = 100, by = 1), size = 11, replace = TRUE)
#Fortify shapefile to use w/ ggplot
austria.ft <- fortify(austria, region="ID_1")
data<-merge(austria.ft, austria, region="id", by.x = "id", by.y = "ID_1")
#Save as ggplot object
gg<-ggplot(data, aes(x = long, y = lat, fill = example1, group = group, text = paste("Province:", NAME_1))) +
geom_polygon(color="black", size=0.2) +
coord_equal() +
scale_fill_gradient(low = "lightgrey", high = "darkred", name='Index') +xlab("")+ylab("") +
theme(axis.text = element_blank(),
axis.title = element_blank(),
axis.ticks = element_blank()) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black")) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"))
gg <- plotly_build(gg)
gg
That produces the following plot:
Notice that "Province" is now capitalized (it was not before). The trick was adding text = paste("Province:", NAME_1) to the aesthetics. HOWEVER, when I tried to add additional label changes using text2=paste("Example III:", example1), the following occurs:
Notice that it cannot render text2 the same way it renders text1. So instead I simply try adding a duplicate without the text2 like in the following: text=paste("Example III:", example1) -which produces the following odd result:
I'm beginning to think something as simple as toggling "legend" options in plotly's ggplot conversion is impossible.
UPDATE #4: So I decided to approach this another way and change the variable names themselves. I would have done this from the start, except I wasn't sure if/how ggplot2 accepts variable names with spaces. I figured out that wrapping the name in backticks (`variable`) can work. So I went ahead and relabeled the variables. It works, KINDA. The problem is that the text appears with quotation marks around it. Now I need a way to get rid of these! Any ideas, anyone? Thanks! Here is an image of what I mean by quotations in the text:
I am new to plotly too but have come across a similar problem for my ggplot2 bubble plots when using ggplotly(). I have finally found a solution that works for me and thought it might help you, too, although I haven't tried it for choropleth maps.
Your first question was to customize the tooltip so it displays some of the variables in the dataset (including those not mapped to aesthetics).
In your UPDATE #3 you introduce text = paste("Province:", NAME_1) into your aes. If you want to add a second line of custom variables or text, just keep adding to the paste() call: text = paste("Province:", NAME_1, "Example III:", example1). To add a line break between the two, put <br> in the spot where you want the break to be, like: text = paste("Province:", NAME_1, "<br>", "Example III:", example1)
Your second question was to customize the tooltip so it does NOT display other (default) variables (that are mapped to aesthetics, such as the coordinates).
I found this very easy addition to the ggplotly() function that did the trick for me: ggplotly(gg, tooltip = c("text")) In my case, this removed ALL default variables that are shown in the tooltip and only showed those that are custom specified with text above. You can add other variables back in by doing ggplotly(gg, tooltip = c("text","x")) The order of the variables shown in the tooltip will be the same as the order specified in the tooltip argument. I found this documented here: https://github.com/ropensci/plotly/blob/master/R/ggplotly.R
This solution worked (in principle) for me using R 3.1.1 and plotly 3.4.13
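The tooltip string itself can be built and inspected before it goes anywhere near plotly. A minimal base-R sketch with made-up data standing in for the merged map data; the column name tooltip is an assumption introduced for illustration.

```r
# Made-up stand-in for the merged map data
df <- data.frame(NAME_1 = c("Tirol", "Wien"),
                 example1 = c(42, 7))

# Custom labels plus an HTML line break, built once as a column
df$tooltip <- paste0("Province: ", df$NAME_1, "<br>",
                     "Example III: ", df$example1)

df$tooltip
```

Such a column could then be mapped with aes(text = tooltip) and shown on its own with ggplotly(gg, tooltip = "text"), as described in the answer above.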

ggplot2 facet plot of shapefile polygons produces strange lines

I'm working to produce a facet/lattice plot of choropleth maps that each show how different model runs affect one variable being mapped across a number of polygons. The problem is that the output graphic produces strange lines that run between the polygons in each plot (see the graphic below).
While I've manipulated and converted the shapefile into a data frame with appropriate attributes for ggplot2, I'm not familiar with the details of how to use the package and the online documentation is limited for such a complex package. I'm not sure what parameter is causing this issue, but I suspect it may be the aes parameter.
The script:
# library() attaches one package per call
library(rgdal)
library(tidyr)
library(maptools)
library(ggplot2)
library(dplyr)
library(reshape2)
setwd('D:/path/to/wd')
waterloo <- read.table("waterloo-data.txt", header=TRUE, sep=',', stringsAsFactors=FALSE)
waterloo <- data.frame(waterloo$DAUID, waterloo$LA0km, waterloo$LA4_exp, waterloo$LA20km, waterloo$LA30km, waterloo$LA40km, waterloo$LA50km)
colnames(waterloo) <- c("DAUID", "LA0km", "LA10km","LA20km", "LA30km", "LA40km", "LA50km")
## Produces expenditure measurements by ID variable DAUID, using reshape2/melt
wtidy <- melt(waterloo, id.vars=c("DAUID"), measure.vars = c("LA0km", "LA10km", "LA20km", "LA30km", "LA40km", "LA50km"))
colnames(wtidy) <- c("DAUID", "BufferSize", "Expenditure")
wtidy$DAUID <- as.factor(wtidy$DAUID) # for subsequent join with wtrl_f
### READ SPATIAL DATA ###
#wtrl <- readOGR(".", "Waterloo_DA_2011_new")
wtrl <- readShapeSpatial("Waterloo_DA_2011_new")
wtrl$id <- row.names(wtrl)
wtrl_f <- fortify(wtrl)
wtrl_f <- left_join(wtrl_f, wtrl@data, by="id")
# Join wtrl fortified (wtrl_f) to either twaterloo or wtidy
wtrl_f <- left_join(wtrl_f, wtidy, by="DAUID")
### PLOT SPATIAL DATA ###
ggplot(data = wtrl_f, # the input data
aes(x = long.x, y = lat.x, fill = Variable/1000, group = BufferSize)) + # define variables
geom_polygon() + # plot the DAs
geom_path(colour="black", lwd=0.05) + # polygon borders
coord_equal() + # fixed x and y scales
facet_wrap(~ BufferSize, ncol = 2) + # one plot per buffer size
scale_fill_gradient2(low = "green", mid = "grey", high = "red", # colors
midpoint = 10000, name = "Variable\n(thousands)") + # legend options
theme(axis.text = element_blank(), # change the theme options
axis.title = element_blank(), # remove axis titles
axis.ticks = element_blank()) # remove axis ticks
The output graphic appears as follows:
Strange! I've made good progress but I don't know where ggplot is getting these lines. Any help on this would be appreciated!
PS; as an additional unrelated question, the polygon lines are rather jagged. How would I smooth these lines?
This answer helped me to solve my problem, but not before I made up this minimal example ready to post. I'm sharing it here in case it helps someone solve the same problem faster.
Problem:
I'm trying to make a basic map in R with ggplot2. The polygons are filling wrong, making extra lines.
library("ggplot2")
library("maps")
map <- ggplot(map_data("world", region = "UK"), aes(x = long, y = lat)) + geom_polygon()
map
wrong map image
Solution:
I have to set the aesthetic "group" parameter to put the polygon points in the right order, otherwise ggplot will try to plot a patch of Scotland coastline in the middle of the south coast (for example).
map <- ggplot(map_data("world", region = "UK"), aes(x = long, y = lat, group = group)) + geom_polygon()
map
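Why a missing group produces stray lines can be seen without drawing anything. The following base-R sketch uses two made-up unit squares and measures the distance between consecutive vertices; sq and step_len are hypothetical helper names.

```r
# Hypothetical helper: vertices of a unit square starting at x0, tagged with a group id
sq <- function(x0, id) {
  data.frame(long = x0 + c(0, 1, 1, 0), lat = c(0, 0, 1, 1), group = id)
}
pts <- rbind(sq(0, "a"), sq(10, "b"))  # two separate polygons, ten units apart

# Length of each step between consecutive vertices
step_len <- function(d) sqrt(diff(d$long)^2 + diff(d$lat)^2)

# Treated as one path, the last vertex of "a" is joined to the first vertex of "b"
ungrouped <- step_len(pts)

# Treated per group, every step stays inside its own polygon
grouped <- unlist(lapply(split(pts, pts$group), step_len))

max(ungrouped)  # the long jump between the two squares
max(grouped)    # only unit-length sides remain
```

The long step in the ungrouped case corresponds to the stray line that appears on the map when group is missing or mapped to the wrong variable.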
OK, I managed to resolve this issue by changing the aesthetic group parameter found on page 11 of the ggplot2 manual:
http://cran.r-project.org/web/packages/ggplot2/ggplot2.pdf
The correct parameter is "group" and not the factor that is used to group the plots. The correct ggplot code:
ggplot(data = wtrl_f, # the input data
aes(x = long.x, y = lat.x, fill = Expenditure/1000, group = group)) + # define variables
geom_polygon() + # plot the DAs
geom_path(colour="black", lwd=0.025) + # DA borders
coord_equal() + # fixed x and y scales
facet_wrap(~ BufferSize, ncol = 2) + # one plot per buffer size
scale_fill_gradient2(low = "green", mid = "grey", high = "red", # colors
midpoint = 10000, name = "Expenditures\n(thousands)") + # legend options
theme(axis.text = element_blank(), # change the theme options
axis.title = element_blank(), # remove axis titles
axis.ticks = element_blank()) # remove axis ticks

Make the value of the fill the actual fill in ggplot2

Is there a way to have the value of the fill (the label) become the fill itself? For instance, in a stacked bar plot, I have
require(ggplot2)
require(ggplot2movies) # the movies data set lives here since ggplot2 2.0.0
big_votes_movies = movies[movies$votes > 100000,]
p = ggplot(big_votes_movies, aes(x=rating, y=votes, fill=year)) + geom_bar(stat="identity")
Can the values of 1997 and whatnot be the fill itself? A motif plot, if you will? An example of a motif plot is:
If this is possible, can I also plot these values on polar coordinates, so the fill would become the value?
p + coord_polar(theta="y")
There is a way to do it, but it's a little ugly.
When I first looked at it, I wondered if it could be done using geom_text, but although it gave a representation, it didn't really fit the motif structure. This was a first attempt:
require(ggplot2)
big_votes_movies = movies[movies$votes > 100000,]
p <- ggplot(big_votes_movies, aes(x=rating, y=votes, label=year))
p + geom_text(size=12, aes(colour=factor(year), alpha=0.3)) + geom_jitter(alpha=0) +
scale_x_continuous(limits=c(8, 9.5)) + scale_y_continuous(limits=c(90000,170000))
So then I realised you had to actually render the images within the grid/ggplot framework. You can do it, but you need to have physical images for each year (I created rudimentary images using ggplot, just to use only one tool, but maybe Photoshop would be better!) and then make your own grobs which you can add as custom annotations. You then need to make your own histogram bins and plot using apply. See below (it could be prettied up fairly easily). Sadly only works with cartesian co-ords :(
require(ggplot2)
require(png)
require(plyr)
require(grid)
years<-data.frame(year=unique(big_votes_movies$year))
palette(rainbow(nrow(years)))
years$col<-palette() # manually set some different colors
# create a function to write the "year" images
writeYear<-function(year,col){
png(filename=paste(year,".png",sep=""),width=550,height=300,bg="transparent")
im<-qplot(1,1,xlab=NULL,ylab=NULL) +
theme(axis.text.x = element_blank(),axis.text.y = element_blank()) +
theme(panel.background = element_rect(fill = "transparent",colour = NA), plot.background = element_rect(fill = "transparent",colour = NA), panel.grid.minor = element_line(colour = "white")) +
geom_text(label=year, size=80, color=col)
print(im)
dev.off()
}
#call the function to create the placeholder images
apply(years,1,FUN=function(x)writeYear(x["year"],x["col"]))
# then roll up the data
summarydata<-big_votes_movies[,c("year","rating","votes")]
# make own bins (a cheat)
summarydata$rating<-cut(summarydata$rating,breaks=c(0,8,8.5,9,Inf),labels=c(0,8,8.5,9))
aggdata <- ddply(summarydata, c("year", "rating"), summarise, votes = sum(votes) )
aggdata<-aggdata[order(aggdata$rating),]
aggdata<-ddply(aggdata,.(rating),transform,ymax=cumsum(votes),ymin=c(0,cumsum(votes))[1:length(votes)])
aggdata$imgname<-apply(aggdata,1,FUN=function(x)paste(x["year"],".png",sep=""))
#work out the upper limit on the y axis
ymax<-max(aggdata$ymax)
#plot the basic chart
z<-qplot(x=10,y=10,geom="blank") + scale_x_continuous(limits=c(8,9.5)) + scale_y_continuous(limits=c(0,ymax))
#make a function to create the grobs and call the annotation_custom function
callgraph<-function(df){
tiles<-apply(df,1,FUN=function(x)return(annotation_custom(rasterGrob(image=readPNG(x["imgname"]),
x=0,y=0,height=1,width=1,just=c("left","bottom")),
xmin=as.numeric(x["rating"]),xmax=as.numeric(x["rating"])+0.5,ymin=as.numeric(x["ymin"]),ymax=as.numeric(x["ymax"]))))
return(tiles)
}
# then add the annotations to the plot
z+callgraph(aggdata)
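The manual binning and stacking used in the script above can be checked on a few made-up numbers. This base-R sketch reuses the same cut() breaks and the same cumsum() expressions; the ratings and votes are invented for illustration.

```r
# Manual bins: each rating is labelled with the left-hand x position of its tile
ratings <- c(7.9, 8.2, 8.7, 9.1)  # made-up ratings
bins <- cut(ratings, breaks = c(0, 8, 8.5, 9, Inf), labels = c(0, 8, 8.5, 9))
bin_left <- as.numeric(as.character(bins))  # factor labels back to numbers

# Stacking within one bin: each tile starts where the previous one ended
votes <- c(3, 5, 2)  # made-up vote totals
ymax <- cumsum(votes)
ymin <- c(0, cumsum(votes))[1:length(votes)]

data.frame(votes, ymin, ymax)
```

Each (ymin, ymax) pair gives the vertical extent of one image tile, which is what annotation_custom() receives in callgraph().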
And here's the plot with photoshopped images. I just saved them over the generated images, and ran the second half of the script so as not to regenerate them.
OK - and then because it was bothering me, I decided to install extrafont and build the prettier graph using just R:
and here's the code:
require(ggplot2)
require(png)
require(plyr)
require(grid)
require(extrafont)
#font_import(pattern="Show") RUN THIS ONCE ONLY
#load the fonts
loadfonts(device="win")
#create a subset of data with big votes
big_votes_movies = movies[movies$votes > 100000,]
#create a custom palette and append to a table of the unique years (labels)
years<-data.frame(year=unique(big_votes_movies$year))
palette(rainbow(nrow(years)))
years$col<-palette()
#function to create the labels as png files
writeYear<-function(year,col){
png(filename=paste(year,".png",sep=""),width=440,height=190,bg="transparent")
im<-qplot(1,1,xlab=NULL,ylab=NULL,geom="blank") +
geom_text(label=year,size=70, family="Showcard Gothic", color=col,alpha=0.8) +
theme(axis.text.x = element_blank(),axis.text.y = element_blank()) +
theme(panel.background = element_rect(fill = "transparent",colour = NA),
plot.background = element_rect(fill = "transparent",colour = NA),
panel.grid.minor = element_line(colour = "transparent"),
panel.grid.major = element_line(colour = "transparent"),
axis.ticks=element_blank())
print(im)
dev.off()
}
#call the function to create the placeholder images
apply(years,1,FUN=function(x)writeYear(x["year"],x["col"]))
#summarize the data, and create bins manually
summarydata<-big_votes_movies[,c("year","rating","votes")]
summarydata$rating<-cut(summarydata$rating,breaks=c(0,8,8.5,9,Inf),labels=c(0,8,8.5,9))
aggdata <- ddply(summarydata, c("year", "rating"), summarise, votes = sum(votes) )
aggdata<-aggdata[order(aggdata$rating),]
aggdata<-ddply(aggdata,.(rating),transform,ymax=cumsum(votes),ymin=c(0,cumsum(votes))[1:length(votes)])
#identify the image placeholders
aggdata$imgname<-apply(aggdata,1,FUN=function(x)paste(x["year"],".png",sep=""))
ymax<-max(aggdata$ymax)
#do the basic plot
z<-qplot(x=10,y=10,geom="blank",xlab="Rating",ylab="Votes \n",main="Big Movie Votes \n") +
theme_bw() +
theme(panel.grid.major = element_line(colour = "transparent"),
text = element_text(family="Kalinga", size=20,face="bold")
) +
scale_x_continuous(limits=c(8,9.5)) +
scale_y_continuous(limits=c(0,ymax))
#create a function to create the grobs and return annotation_custom() calls
callgraph<-function(df){
tiles<-apply(df,1,FUN=function(x)return(annotation_custom(rasterGrob(image=readPNG(x["imgname"]),
x=0,y=0,height=1,width=1,just=c("left","bottom")),
xmin=as.numeric(x["rating"]),xmax=as.numeric(x["rating"])+0.5,ymin=as.numeric(x["ymin"]),ymax=as.numeric(x["ymax"]))))
return(tiles)
}
#add the tiles to the base chart
z+callgraph(aggdata)
