Plotting to shapefile region with ggplot / ggmap - r

I have a shapefile showing the map of England with regions mapped on it:
England <- readOGR(dsn = "...")
England.fort <- fortify(England, region='regionID')
England.fort <-England.fort[order(England.fort$order), ]
giving me England.fort:
England.fort
>long
>lat
>order
>hole
>piece
>id #contains the region IDs
>group #contains the region IDs
>Total #I want to plot this
Shapefile from here: https://geoportal.statistics.gov.uk/Docs/Boundaries/Local_authority_district_(GB)_2014_Boundaries_(Generalised_Clipped).zip
I want to plot the regions showing the total number of people in each:
p <- ggplot(data=England.fort, aes(x=long, y=lat, group=group, fill="Total")) +
geom_polygon(colour='black', fill='white') + theme_bw()
But it gives me a blank map of England with all the regions white.

ggplot(data=England.fort, aes(x=long, y=lat, group=group, fill=Total)) +
geom_polygon() +
theme_bw()
Does the trick. Thanks
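To see why the original call came out white, here is a minimal sketch on a made-up two-region data frame (the `toy` data is hypothetical and only stands in for the fortified shapefile). It assumes nothing beyond ggplot2 itself: an unquoted column name inside aes() is mapped to the data, whereas a quoted "Total" is a constant, and a fill set outside aes() in the layer overrides any mapping.

```r
library(ggplot2)

# Hypothetical stand-in for a fortified shapefile: two square "regions".
toy <- data.frame(
  long  = c(0, 1, 1, 0,  1, 2, 2, 1),
  lat   = c(0, 0, 1, 1,  0, 0, 1, 1),
  group = rep(c("a", "b"), each = 4),
  Total = rep(c(10, 50), each = 4)
)

# Unquoted column name inside aes(): fill is mapped to the Total column.
p_mapped <- ggplot(toy, aes(x = long, y = lat, group = group, fill = Total)) +
  geom_polygon(colour = "black") +
  theme_bw()

# Quoted "Total" is a constant string, and fill = "white" set outside aes()
# in the layer overrides the mapping, so every region is drawn white.
p_constant <- ggplot(toy, aes(x = long, y = lat, group = group, fill = "Total")) +
  geom_polygon(colour = "black", fill = "white") +
  theme_bw()
```

Printing `p_mapped` should show two differently shaded squares, while `p_constant` should show two white ones.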

Related

Ggplot2 - Map polygons aes fill in ggplot() versus geom()

I am practicing the ggplot2 grammar of graphics on World maps using the base R dataset and the ggplot2 and mapproj packages.
When building a map that colours countries by a random variable (called "colour" in the following example):
world_map <- map_data("world")
country_colours <- data.frame(region = c(names(table(world_map$region))),
colour= sample(c(1:20), length(names(table(world_map$region))), replace = TRUE))
world_map <- merge(world_map, country_colours)
world_map <- world_map[order(world_map$region, world_map$order),]
I happened to include the aes(fill) argument in the ggplot component:
ggplot(world_map[abs(world_map$long) < 180,], aes(x=long, y=lat, group=group, fill=colour)) +
geom_polygon(color="black") +
coord_map(projection = "mercator")
Now, if I include it in the geom_polygon() component instead, I get EXACTLY the same output:
ggplot(world_map[abs(world_map$long) < 180,], aes(x=long, y=lat, group=group)) +
geom_polygon(aes(fill=colour), color="black") +
coord_map(projection = "mercator")
I would like to understand the conceptual difference between placing the fill aesthetic in one place or the other.
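One way to see the difference is to add a second layer. The sketch below uses a small made-up data frame (`toy` is hypothetical) and the colour aesthetic rather than fill, since both lines and points use colour. A mapping given in ggplot() is inherited by every layer, while a mapping given inside one geom applies to that layer only. With a single layer, as in the maps above, the two placements give the same plot.

```r
library(ggplot2)

# Hypothetical data: two short series, "a" and "b".
toy <- data.frame(
  x = c(1, 2, 3, 1, 2, 3),
  y = c(1, 2, 1, 2, 3, 2),
  g = rep(c("a", "b"), each = 3)
)

# Mapping in ggplot(): inherited by both layers,
# so the lines and the points are coloured by g.
p_global <- ggplot(toy, aes(x = x, y = y, group = g, colour = g)) +
  geom_line() +
  geom_point()

# Mapping inside geom_point() only: the points are coloured by g,
# while the lines keep the default colour.
p_local <- ggplot(toy, aes(x = x, y = y, group = g)) +
  geom_line() +
  geom_point(aes(colour = g))
```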

Cleaning up a map using geom_tile

Thanks to help from some users on this site, I was able to get a nice map plot for some data using geom_point. (Get boundaries to come through on states) However, now I'm trying to clean it up as I have more years to plot and want to make sure the plot is working and providing good information. After some further research, it seems the geom_tile would actually be better for this as it would shy away from points and use a gradient.
The problem I'm running into is getting the code to work with geom_tile. It isn't plotting anything and I'm not sure why.
Here's the dataset :
https://www.dropbox.com/s/0evuvrlm49ab9up/PRISM_1895_db.csv?dl=0
Here's the original code with geom_points :
PRISM_1895_db <- read.csv("/.../PRISM_1895_db.csv")
regions<- c("north dakota","south dakota","nebraska","kansas","oklahoma","texas","minnesota","iowa","missouri","arkansas", "illinois", "indiana", "wisconsin")
ggplot() +
geom_polygon(data=subset(map_data("state"), region %in% regions), aes(x=long, y=lat, group=group)) +
geom_point(data = PRISM_1895_db, aes(x = longitude, y = latitude, color = APPT), alpha = .5, size = 3.5) +
geom_polygon(data=subset(map_data("state"), region %in% regions), aes(x=long, y=lat, group=group), color="white", fill=NA)
And here is the code I've been trying, but none of the data is showing up.
ggplot() +
geom_polygon(data=subset(map_data("state"), region %in% regions), aes(x=long, y=lat, group=group)) +
geom_tile(data = PRISM_1895_db, aes(x = longitude, y = latitude, fill = APPT), alpha = 0.5, color = NA) +
geom_polygon(data=subset(map_data("state"), region %in% regions), aes(x=long, y=lat, group=group), color="white", fill=NA)
geom_tile needs your x and y values to be sampled on a regular grid. It needs to be able to tile the surface in rectangles. Since your data is irregularly sampled, it's not possible to divide up the raw data into a bunch of nice tiles.
One option is to use the stat_summary2d layer to divide your data into boxes and calculate the average APPT for all points in that box. This will allow you to create regular tiles. For example
ggplot() +
geom_polygon(data=subset(map_data("state"), region %in% regions), aes(x=long, y=lat, group=group)) +
stat_summary2d(data=PRISM_1895_db, aes(x = longitude, y = latitude, z = APPT)) +
geom_polygon(data=subset(map_data("state"), region %in% regions), aes(x=long, y=lat, group=group), color="white", fill=NA)
which produces a map of binned averages. You can look at other options, such as bins or binwidth, to control the bin sizes if you like. But note that it's "smoothing" out the data by taking averages inside bins.
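As a rough sketch of controlling the bin size, the example below uses randomly generated points (the `pts` data frame is made up and only mimics the longitude, latitude and APPT columns) and sets `binwidth` to one degree in each direction. It uses the `stat_summary_2d` spelling, which is the current name of `stat_summary2d` in recent ggplot2 releases.

```r
library(ggplot2)

# Hypothetical, irregularly sampled points standing in for PRISM_1895_db.
set.seed(1)
pts <- data.frame(
  longitude = runif(500, -104, -86),
  latitude  = runif(500, 26, 49),
  APPT      = runif(500, 0, 100)
)

# Average APPT inside 1 x 1 degree boxes; smaller binwidth means finer tiles.
p_bins <- ggplot(pts, aes(x = longitude, y = latitude, z = APPT)) +
  stat_summary_2d(fun = "mean", binwidth = c(1, 1))
```

Passing `bins = 20` instead of `binwidth` would ask for a fixed number of bins per axis rather than a fixed size.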

Overlaying polygons on ggplot map

I'm struggling to overlay neighborhood boundaries on a google map. I'm trying to follow this code. My version is below. Do you see anything obviously wrong?
#I set the working directory just before this...
chicago = readOGR(dsn=".", layer="CommAreas")
overlay <- spTransform(chicago,CRS("+proj=longlat +datum=WGS84"))
overlay <- fortify(overlay)
location <- unlist(geocode('4135 S Morgan St, Chicago, IL 60609'))+c(0,.02)
ggmap(get_map(location=location,zoom = 10, maptype = "terrain", source = "google",col="bw")) +
geom_polygon(aes(x=long, y=lat, group=group), data=overlay, alpha=0)+
geom_path(color="red")
Any insight would be much appreciated. Thanks for your help and patience.
This worked for me:
library(rgdal)
library(ggmap)
# download shapefile from:
# https://data.cityofchicago.org/api/geospatial/cauq-8yn6?method=export&format=Shapefile
# setwd accordingly
overlay <- readOGR(".", "CommAreas")
overlay <- spTransform(overlay, CRS("+proj=longlat +datum=WGS84"))
overlay <- fortify(overlay, region="COMMUNITY") # it works w/o this, but I figure you eventually want community names
location <- unlist(geocode('4135 S Morgan St, Chicago, IL 60609'))+c(0,.02)
gmap <- get_map(location=location, zoom = 10, maptype = "terrain", source = "google", col="bw")
gg <- ggmap(gmap)
gg <- gg + geom_polygon(data=overlay, aes(x=long, y=lat, group=group), color="red", alpha=0)
gg <- gg + coord_map()
gg <- gg + theme_bw()
gg
You might want to restart your R session in the event there's anything in the environment causing issues, but you can set the line color and alpha 0 fill in the geom_polygon call (like I did).
You can also do:
gg <- gg + geom_map(map=overlay, data=overlay,
aes(map_id=id, x=long, y=lat, group=group), color="red", alpha=0)
instead of the geom_polygon which gives you the ability to draw a map and perform aesthetic mappings in one call vs two (if you're coloring based on other values).

Choropleth map in ggplot with polygons that have holes

I'm trying to draw a choropleth map of Germany showing poverty rate by state (inspired by this question).
The problem is that some of the states (Berlin, for example) are completely surrounded by other states (Brandenburg), and I'm having trouble getting ggplot to recognize the "hole" in Brandenburg.
The data for this example is here.
library(rgdal)
library(ggplot2)
library(RColorBrewer)
map <- readOGR(dsn=".", layer="germany3")
pov <- read.csv("gerpoverty.csv")
mrg.df <- data.frame(id=rownames(map@data),ID_1=map@data$ID_1)
mrg.df <- merge(mrg.df,pov, by="ID_1")
map.df <- fortify(map)
map.df <- merge(map.df,mrg.df[,c("id","poverty")], by="id")
ggplot(map.df, aes(x=long, y=lat, group=group)) +
geom_polygon(aes(fill=poverty))+
geom_path(colour="grey50")+
scale_fill_gradientn(colours=brewer.pal(5,"OrRd"))+
labs(x="",y="")+ theme_bw()+
coord_fixed()
Notice how the colors for Berlin and Brandenburg (in the northeast) are identical. They shouldn't be: Berlin's poverty rate is much lower than Brandenburg's. It appears that ggplot is rendering the Berlin polygon and then rendering the Brandenburg polygon over it, without the hole.
If I change the call to geom_polygon(...) as suggested here, I can fix the Berlin/Brandenburg problem, but now the three northernmost states are rendered incorrectly.
ggplot(map.df, aes(x=long, y=lat, group=group)) +
geom_polygon(aes(group=poverty, fill=poverty))+
geom_path(colour="grey50")+
scale_fill_gradientn(colours=brewer.pal(5,"OrRd"))+
labs(x="",y="")+ theme_bw()+
coord_fixed()
What am I doing wrong??
This is just an expansion on @Ista's answer, which does not require that one knows which states (Berlin, Bremen) need to be rendered last.
This approach takes advantage of the fact that fortify(...) generates a column, hole, which identifies whether a group of coordinates is a hole. So this renders all regions (ids) with any holes before (i.e. underneath) the regions without holes.
Many thanks to @Ista, without whose answer I could not have come up with this (believe me, I spent many hours trying...)
ggplot(map.df, aes(x=long, y=lat, group=group)) +
geom_polygon(data=map.df[map.df$id %in% map.df[map.df$hole,]$id,],aes(fill=poverty))+
geom_polygon(data=map.df[!map.df$id %in% map.df[map.df$hole,]$id,],aes(fill=poverty))+
geom_path(colour="grey50")+
scale_fill_gradientn(colours=brewer.pal(5,"OrRd"))+
labs(x="",y="")+ theme_bw()+
coord_fixed()
You can plot the island polygons in a separate layer, following the example on the ggplot2 wiki. I've modified your merging steps to make this easier:
mrg.df <- data.frame(id=rownames(map@data),ID_1=map@data$ID_1)
mrg.df <- merge(mrg.df,pov, by="ID_1")
map.df <- fortify(map)
map.df <- merge(map.df,mrg.df, by="id")
ggplot(map.df, aes(x=long, y=lat, group=group)) +
geom_polygon(aes(fill=poverty), color = "grey50", data =subset(map.df, !Id1 %in% c("Berlin", "Bremen")))+
geom_polygon(aes(fill=poverty), color = "grey50", data =subset(map.df, Id1 %in% c("Berlin", "Bremen")))+
scale_fill_gradientn(colours=brewer.pal(5,"OrRd"))+
labs(x="",y="")+ theme_bw()+
coord_fixed()
As an unsolicited act of evangelism, I encourage you to consider something like
library(ggmap)
qmap("germany", zoom = 6) +
geom_polygon(aes(x=long, y=lat, group=group, fill=poverty),
color = "grey50", alpha = .7,
data =subset(map.df, !Id1 %in% c("Berlin", "Bremen")))+
geom_polygon(aes(x=long, y=lat, group=group, fill=poverty),
color = "grey50", alpha= .7,
data =subset(map.df, Id1 %in% c("Berlin", "Bremen")))+
scale_fill_gradientn(colours=brewer.pal(5,"OrRd"))
to provide context and familiar reference points.
Just to add another small improvement to @Ista's and @jhoward's answers (thanks a lot for your help!).
The modification of @jhoward could be easily wrapped in a small function like this
gghole <- function(fort){
poly <- fort[fort$id %in% fort[fort$hole,]$id,]
hole <- fort[!fort$id %in% fort[fort$hole,]$id,]
out <- list(poly,hole)
names(out) <- c('poly','hole')
return(out)
}
# input has to be a fortified data.frame
Then, one doesn't need to recall every time how to extract holes info. The code would look like
ggplot(map.df, aes(x=long, y=lat, group=group)) +
geom_polygon(data=gghole(map.df)[[1]],aes(fill=poverty),colour="grey50")+
geom_polygon(data=gghole(map.df)[[2]],aes(fill=poverty),colour="grey50")+
# (optionally). Call by name
# geom_polygon(data=gghole(map.df)$poly,aes(fill=poverty),colour="grey50")+
# geom_polygon(data=gghole(map.df)$hole,aes(fill=poverty),colour="grey50")+
scale_fill_gradientn(colours=brewer.pal(5,"OrRd"))+
labs(x="",y="")+ theme_bw()+
coord_fixed()
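To check what the split by the hole column actually selects, here is a small sketch on a toy data frame (the `fort` values are invented and keep only the two columns the split relies on). Region "A" contains a hole, region "B" does not, so all rows of "A" land in the layer drawn first.

```r
# Toy stand-in for fortify() output; only the columns the split uses.
fort <- data.frame(
  id   = c("A", "A", "A", "A", "B", "B"),
  hole = c(FALSE, FALSE, TRUE, TRUE, FALSE, FALSE)
)

# Ids of regions that have at least one hole ring.
ids_with_holes <- unique(fort$id[fort$hole])

# Regions with holes are drawn first (underneath) ...
with_holes <- fort[fort$id %in% ids_with_holes, ]
# ... and regions without holes are drawn on top.
without_holes <- fort[!fort$id %in% ids_with_holes, ]
```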
Alternatively you could create that map using rworldmap.
library(rworldmap)
library(RColorBrewer)
library(rgdal)
map <- readOGR(dsn=".", layer="germany3")
pov <- read.csv("gerpoverty.csv")
#join data to the map
sPDF <- joinData2Map(pov,nameMap='map',nameJoinIDMap='VARNAME_1',nameJoinColumnData='Id1')
#default map
#mapPolys(sPDF,nameColumnToPlot='poverty')
colours=brewer.pal(5,"OrRd")
mapParams <- mapPolys( sPDF
,nameColumnToPlot='poverty'
,catMethod="pretty"
,numCats=5
,colourPalette=colours
,addLegend=FALSE )
do.call( addMapLegend, c( mapParams
, legendLabels="all"
, legendWidth=0.5
))
#to test state names
#text(pov$x,pov$y,labels=pov$Id1)

Converting polygons into latitude and longitude in R/Excel

I have a data set about all the counties in Minnesota, and one of the columns is its shape. For each county it looks something like this:
For Aitkin County:
<Polygon><outerBoundaryIs><LinearRing><coordinates>-93.051956,46.15767700000001,0 -93.434006,46.15313,0 -93.43261,46.240253,0 -93.80480900000001,46.23817100000001,0 -93.80933400000001,46.580681,0 -93.77426199999999,46.59050400000001,0 -93.77412400000001,46.802605,0 -93.77500100000002,47.030445,0 -93.058258,47.022362,0 -93.05964600000001,46.766071,0 -93.05208600000002,46.417576,0 -93.051956,46.15767700000001,0</coordinates></LinearRing></outerBoundaryIs></Polygon>
I'm fairly new to R and know nothing about Google API, HTML, etc. I'm trying to use the ggplot2 and maps packages to create an intensity map for various aspects of all the counties in Minnesota. Is there a way to use these coordinates as they are to make a layer of counties, or do I need to do something else?
Here's the code I have so far:
Map of MN:
library(maps)
library(ggplot2)
all_states <- map_data("state")
mn<-subset(all_states, region %in% c("minnesota"))
p<-ggplot()
p<-p+geom_polygon(data=mn, aes(x=long, y=lat, group=group), colour="black", fill="white")
p
And my plan is to modify the following to apply to each county, once I get those polygons:
dataset <- data.frame(region=states,val=runif(49, 0,1))
us_state_map <- map_data('state')
map_data <- merge(us_state_map, dataset, by='region', all=T)
map_data <- map_data[order(map_data$order), ]
(qplot(long, lat, data=map_data, geom="polygon", group=group, fill=val)
+ theme_bw() + labs(x="", y="", fill="")
+ scale_fill_gradient(low='#EEEEEE', high='darkgreen')
+ ggtitle("Title")
+ theme(legend.position="bottom", legend.direction="horizontal"))
Any suggestions would be greatly appreciated!
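One possible starting point, sketched with base R only, is to pull the text out of the coordinates element and split it into longitude and latitude columns. The `kml` string below is an abbreviated, rounded version of the Aitkin County polygon, and the `county` data frame with its `order` and `group` columns is just an assumed layout chosen to resemble what map_data() returns.

```r
# Abbreviated, rounded version of the Aitkin County shape string.
kml <- paste0(
  "<Polygon><outerBoundaryIs><LinearRing><coordinates>",
  "-93.051956,46.157677,0 -93.434006,46.15313,0 ",
  "-93.43261,46.240253,0 -93.051956,46.157677,0",
  "</coordinates></LinearRing></outerBoundaryIs></Polygon>"
)

# Pull out the text between the <coordinates> tags.
coords_txt <- sub(".*<coordinates>(.*)</coordinates>.*", "\\1", kml)

# Each vertex is "lon,lat,alt"; vertices are separated by whitespace.
vertices <- strsplit(trimws(coords_txt), "\\s+")[[1]]
parts <- do.call(rbind, strsplit(vertices, ","))

# One row per vertex, in drawing order, ready for geom_polygon().
county <- data.frame(
  long  = as.numeric(parts[, 1]),
  lat   = as.numeric(parts[, 2]),
  order = seq_along(vertices),
  group = "Aitkin"
)
```

Repeating this for each county and stacking the results with rbind() would give a data frame that can be passed to geom_polygon(aes(x=long, y=lat, group=group)).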
