Plotting Choropleth Maps from KML Data Using ggplot2 - r

How do I plot a choropleth or thematic map using ggplot2 from a KML data source?
Example KML: https://dl.dropbox.com/u/1156404/nhs_pct.kml
Example data: https://dl.dropbox.com/u/1156404/nhs_dent_stat_pct.csv
Here's what I've got so far:
install.packages("rgdal")
library(rgdal)
library(ggplot2)
fn='nhs_pct.kml'
#Look up the list of layers
ogrListLayers(fn)
#The KML file was originally grabbed from Google Fusion Tables
#There's only one layer...but we still need to identify it
kml=readOGR(fn,layer='Fusiontables folder')
#This seems to work for plotting boundaries:
plot(kml)
#And this:
kk=fortify(kml)
ggplot(kk, aes(x=long, y=lat,group=group))+ geom_polygon()
#Add some data into the mix
nhs <- read.csv("nhs_dent_stat_pct.csv")
kml@data=merge(kml@data,nhs,by.x='Name',by.y='PCT.ONS.CODE')
#I think I can plot against this data using plot()?
plot(kml,col=gray(kml@data$A.30.Sep.2012/100))
#But is that actually doing what I think it's doing?!
#And if so, how can I experiment using other colour palettes?
#But the real question is: HOW DO I DO COLOUR PLOTS USING ggplot?
ggplot(kk, aes(x=long, y=lat,group=group)) #+ ????
So my question is: how do I use e.g. kml@data$A.30.Sep.2012 values to colour the regions?
And as a supplementary question: how might I then experiment with different colour palettes, again in the ggplot context?

Plotting maps in R is very often a pain. Here's an answer which largely follows Hadley's tutorial at https://github.com/hadley/ggplot2/wiki/plotting-polygon-shapefiles
library(maptools)
library(rgdal)
library(ggplot2)
library(plyr)
fn='nhs_pct.kml'
nhs <- read.csv("nhs_dent_stat_pct.csv")
kml <- readOGR(fn, layer="Fusiontables folder")
Note: I got a message about orphan holes. I included the following line after reading https://stat.ethz.ch/pipermail/r-help/2011-July/283281.html
slot(kml, "polygons") <- lapply(slot(kml, "polygons"), checkPolygonsHoles)
## The rest more or less follows Hadley's tutorial
kml.points = fortify(kml, region="Name")
kml.df = merge(kml.points, kml@data, by.x="id",by.y="Name",sort=FALSE)
kml.df <- merge(kml.df,nhs,by.x="id",by.y="PCT.ONS.CODE",sort=FALSE,all.x=T,all.y=F)
## Order matters for geom_path!
kml.df <- kml.df[order(kml.df$order),]
nhs.plot <- ggplot(kml.df, aes(long,lat,group=group,fill=A.30.Sep.2012)) +
geom_polygon() +
geom_path(color="gray") +
coord_equal() +
scale_fill_gradient("The outcome") +
scale_x_continuous("") + scale_y_continuous("") + theme_bw()
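On the supplementary question about palettes: the fill scale is the part to swap out. Here is a minimal sketch on a toy data frame (two made-up squares standing in for kml.df, so it runs without the KML file). The colour choices are arbitrary examples, not recommendations.

```r
library(ggplot2)

# Toy stand-in for kml.df: two unit squares, one made-up value each
toy <- data.frame(
  long  = c(0, 1, 1, 0,  1, 2, 2, 1),
  lat   = c(0, 0, 1, 1,  0, 0, 1, 1),
  group = rep(c("a", "b"), each = 4),
  value = rep(c(20, 80), each = 4)
)

base <- ggplot(toy, aes(long, lat, group = group, fill = value)) +
  geom_polygon(colour = "gray") +
  coord_equal()

# Two-colour gradient with explicit end points
p1 <- base + scale_fill_gradient(name = "The outcome", low = "white", high = "darkred")

# Multi-colour gradient built from any vector of colours
p2 <- base + scale_fill_gradientn(name = "The outcome", colours = terrain.colors(5))

# ColorBrewer palette applied to a continuous variable
p3 <- base + scale_fill_distiller(name = "The outcome", palette = "Blues", direction = 1)
```

Printing p1, p2 or p3 draws the same polygons with a different palette; in the real plot, replace the scale_fill_gradient("The outcome") line with one of these scales.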

Related

Plot pre-1990 world map in R

I am working with some global data from before 1991, so before the USSR, Yugoslavia and Czechoslovakia split up. I would like to plot the data using rworldmap or maps, but the package appears to only have the modern world map easily accessible. All the pre-1991 countries show up blank and with the boundaries dividing their post-1991 counterparts.
This code produces the historical map:
library(maps)
if (requireNamespace("mapdata", quietly=TRUE) && packageVersion("mapdata") >= "2.3")
{map("mapdata::worldLores", fill = TRUE, col = 1:10)}
EDIT: also, as per the helpful comment below, a historical map shapefile is easily obtained from:
library(cshapes)
cshp.data<-cshp(as.Date("1990-01-01"))
plot(cshp.data)
But I cannot figure out if it is possible to combine this with the rworldmap functions ... or if I will have to figure out how to use the maps package, which seems to work differently. (Or maybe there is a ggplot solution?)
The rworldmap code I use currently (to get the modern map) is:
#make example data including Soviet Union
country <- as.vector(c("Afghanistan","Australia","Iceland","Soviet Union",
"Zimbabwe"))
value <- as.vector(c(5,10,100,10,50))
df<-data.frame(country,value)
#make map
map1 <- joinCountryData2Map(df, joinCode = "NAME", nameJoinColumn =
"country")
mapCountryData( map1, addLegend=F, catMethod="fixedWidth",
nameColumnToPlot="value" )
#...Soviet Union is blank
Ahah, there is a ggplot solution using the old map from the mapdata package:
library(ggplot2)
library(dplyr)
library(mapdata)
df<-data.frame(country=c("Afghanistan","Australia","Iceland","USSR","Zimbabwe"),
value=c(5,10,100,10,50),stringsAsFactors=FALSE)
WorldData <- map_data('worldLores') #use the old map
WorldData <- fortify(WorldData)
mapped <- ggplot() +
geom_map(data=WorldData, map=WorldData,
aes(x=long, y=lat, group=group, map_id=region),
fill="white", colour="#7f7f7f", size=0.5) +
geom_map(data=df, map=WorldData,
aes(fill=value, map_id=country),
colour="#7f7f7f", size=0.5)
mapped
(mapping code borrowed from this post, cheers @hrbrmstr)

using topoJSON in ggplot

I'm trying to plot (using ggplot) a topoJSON file I generated from https://pitchinteractiveinc.github.io/tilegrams/.
I used the code below to try to plot the example npr 1-to-1 data:
library(rgeos)
library(rgdal)
library(ggplot2)
library(dplyr)
map = readOGR("data/npr.json", "tiles")
map_df <- fortify(map)
gg = ggplot(data = map_df, aes(long,lat, group=group))
gg = gg + geom_polygon(colour="gray65", size=1.0)
print(gg)
The result is not right.
I've tried plotting this with geom_map, and tried adding coord_equal and coord_map without impact.
I also tried to plot a single polygon and got the image below. Perhaps it suggests that points of the polygon are in incorrect order? Anyone have an idea on how to correct?
Actually, the issue seems to be in the conversion done by readOGR. I imported the JSON manually and extracted the polygons, and it worked fine.
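For anyone wanting to try the manual route, here is a minimal sketch of decoding TopoJSON polygons by hand in base R. It assumes the file has been parsed into nested lists (for example with jsonlite::fromJSON(path, simplifyVector = FALSE)); the toy topology below is hand-built for illustration, and the helper names are hypothetical. It only handles simple Polygon geometries and their exterior ring.

```r
# Toy topology (made up): one square, quantized and delta-encoded
topo <- list(
  transform = list(scale = c(0.5, 0.5), translate = c(10, 20)),
  arcs = list(
    # first position is absolute, the following ones are offsets
    list(c(0, 0), c(2, 0), c(0, 2), c(-2, 0), c(0, -2))
  ),
  objects = list(tiles = list(
    type = "GeometryCollection",
    geometries = list(list(type = "Polygon", arcs = list(list(0))))
  ))
)

# Turn one arc into a two-column matrix of real coordinates
decode_arc <- function(arc, transform) {
  m <- do.call(rbind, lapply(arc, unlist))
  if (!is.null(transform)) {
    m[, 1] <- cumsum(m[, 1])                      # undo delta encoding
    m[, 2] <- cumsum(m[, 2])
    m <- sweep(m, 2, unlist(transform$scale), "*")
    m <- sweep(m, 2, unlist(transform$translate), "+")
  }
  m
}

# Stitch the arcs of one ring; a negative index i means arc (-i - 1) reversed
ring_coords <- function(ring, topo) {
  parts <- lapply(unlist(ring), function(i) {
    reversed <- i < 0
    idx <- if (reversed) -i - 1 else i            # arc indices are 0-based
    m <- decode_arc(topo$arcs[[idx + 1]], topo$transform)
    if (reversed) m[nrow(m):1, , drop = FALSE] else m
  })
  out <- parts[[1]]
  # consecutive arcs share an end point, so drop the duplicate
  for (p in parts[-1]) out <- rbind(out, p[-1, , drop = FALSE])
  out
}

# Build a fortify()-like data frame for all polygons of one object
topo_polygons_df <- function(topo, object) {
  geoms <- topo$objects[[object]]$geometries
  rows <- lapply(seq_along(geoms), function(k) {
    g <- geoms[[k]]
    stopifnot(g$type == "Polygon")                # sketch: simple polygons only
    xy <- ring_coords(g$arcs[[1]], topo)          # exterior ring only
    data.frame(long = xy[, 1], lat = xy[, 2],
               group = k, order = seq_len(nrow(xy)))
  })
  do.call(rbind, rows)
}

map_df <- topo_polygons_df(topo, "tiles")
```

The resulting map_df can be passed to ggplot(map_df, aes(long, lat, group = group)) + geom_polygon() in the same way as the fortified object.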

Plotting OpenStreetMap with ggmap

I'm trying to get the districts of Warsaw and draw them on a Google map. The code below, where 2536107 is the OpenStreetMap relation code for a single Warsaw district, gives me almost what I want but with a few bugs. There is a general outline, but also lines between points which shouldn't be connected. What am I doing wrong?
map <- get_googlemap('warsaw', zoom =10)
warszawa <- get_osm(relation(2536107), full = T)
warszawa.sp <- as_sp(warszawa, what='lines')
warsawfort <- fortify(warszawa.sp)
mapa_polski <- ggmap(map, extent='device', legend="bottomleft")
warsawfort2 <- geom_polygon(aes(x = long, y = lat),
data = warsawfort, fill="blue", colour="black",
alpha=0.0, size = 0.3)
base <- mapa_polski + warsawfort2
base
Edit: I figured it must be somehow connected with order of plotting every point/line but I have no idea how to fix this.
There is a way to generate your map without using external packages: don't use osmar...
This link, to the excellent Mapzen website, provides a set of shapefiles of administrative areas in Poland. If you download and unzip it, you will see a shapefile set called warsaw.osm-admin.*. This is a polygon shapefile of all the districts in Poland, conveniently indexed by osm_id(!!). The code below assumes you have downloaded the file and unzipped it into the "directory with your shapefiles".
library(ggmap)
library(ggplot2)
library(rgdal)
setwd(" <directory with your shapefiles> ")
pol <- readOGR(dsn=".",layer="warsaw.osm-admin")
spp <- pol[pol$osm_id==-2536107,]
wgs.84 <- "+proj=longlat +datum=WGS84"
spp <- spTransform(spp,CRS(wgs.84))
map <- get_googlemap('warsaw', zoom =10)
spp.df <- fortify(spp)
ggmap(map, extent='device', legend="bottomleft") +
geom_polygon(data = spp.df, aes(x = long, y=lat, group=group),
fill="blue", alpha=0.2) +
geom_path(data=spp.df, aes(x=long, y=lat, group=group),
color="gray50", size=0.3)
Two nuances: (1) The osm IDs are stored as negative numbers, so you have to use, e.g.,
spp <- pol[pol$osm_id==-2536107,]
to extract the relevant district, and (2) the shapefile is not projected in WGS84 (long/lat). So we have to reproject it using:
spp <- spTransform(spp,CRS(wgs.84))
The reason osmar doesn't work is that the paths are in the wrong order. Your warszawa.sp is a SpatialLinesDataFrame, made up of a set of paths (12 in your case), each of which is made up of a set of line segments. When you use fortify(...) on this, ggplot tries to combine them into a single sequence of points. But since the paths are not in convex order, ggplot tries, for example, to connect a path that ends in the northeast to a path that begins in the southwest. This is why you're getting all the extra lines. You can see this by coloring the segments:
xx=coordinates(warszawa.sp)
colors=rainbow(11)
plot(t(bbox(warszawa.sp)))
lapply(1:11,function(i)lines(xx[[i]][[1]],col=colors[i],lwd=2))
The colors are in "rainbow" order (red, orange, yellow, green, etc.). Clearly, the lines are not in that order.
EDIT Response to @ako's comment.
There is a way to "fix" the SpatialLines object, but it's not trivial. The function gPolygonize(...) in the rgeos package will take a list of SpatialLines and convert it to a SpatialPolygons object, which can be used in ggplot with fortify(...). One huge problem (which I don't understand, frankly) is that OP's warszawa.sp object has 12 lines, two of which seem to be duplicates, and this causes gPolygonize(...) to fail. So if you create a SpatialLines list with just the first 11 paths, you can convert warszawa.sp to a polygon. This is not general, however, as I can't predict how or if it would work with other SpatialLines objects converted from osm. Here's the code, which leads to the same map as above.
library(rgeos)
coords <- coordinates(warszawa.sp)
sll <- lapply(coords[1:11],function(x) SpatialLines(list(Lines(list(Line(x[[1]])),ID=1))))
spp <- gPolygonize(sll)
spp.df <- fortify(spp)
ggmap(map, extent='device', legend="bottomleft") +
geom_polygon(data = spp.df, aes(x = long, y=lat, group=group),
fill="blue", alpha=0.2) +
geom_path(data=spp.df, aes(x=long, y=lat, group=group),
color="gray50", size=0.3)
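To make the ordering problem concrete, here is a rough base R sketch of chaining unordered paths into one ring by matching end points. It is only an illustration of the idea, not what gPolygonize(...) does internally. The function name stitch_paths is hypothetical, and the sketch assumes every path continues exactly (identical coordinates) from the end of another and that there are no duplicate paths.

```r
# Chain a list of two-column coordinate matrices into a single sequence
stitch_paths <- function(paths) {
  ring <- paths[[1]]
  remaining <- paths[-1]
  while (length(remaining) > 0) {
    tail_pt <- ring[nrow(ring), ]
    found <- FALSE
    for (i in seq_along(remaining)) {
      p <- remaining[[i]]
      if (all(p[1, ] == tail_pt)) {
        found <- TRUE                             # path starts where the ring ends
      } else if (all(p[nrow(p), ] == tail_pt)) {
        p <- p[nrow(p):1, , drop = FALSE]         # path runs backwards; flip it
        found <- TRUE
      }
      if (found) {
        ring <- rbind(ring, p[-1, , drop = FALSE])  # skip the shared point
        remaining <- remaining[-i]
        break
      }
    }
    if (!found) stop("no path continues from the current end point")
  }
  ring
}

# Toy example: a unit square cut into three paths, shuffled, one reversed
a <- rbind(c(0, 0), c(1, 0))
b <- rbind(c(1, 1), c(1, 0))                      # stored in reverse direction
c3 <- rbind(c(1, 1), c(0, 1), c(0, 0))
ring <- stitch_paths(list(a, c3, b))
```

Here ring ends up as the five corner points of the square in drawing order, closed back on the starting point, which is the shape geom_polygon needs.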
I am not sure this is a general hangup, but I can reproduce your example and see the issue. My first thought was that you didn't supply group=id, which is typically used for polygons with many lines, but you have lines, so that should not be needed.
The only way I could get it to display properly was by changing your lines into a polygon off script. QGIS's line-to-polygon tool didn't get this "right" (it produced a large donut hole), so I used ArcMap, which produced a full polygon. If this is a one-off, that may work for your workflow. Odds are it is not. In that case, perhaps rgdal can transform lines to polygons, assuming that is indeed a general problem.
Upon reading the polygon shapefile and fortifying that, your code ran without problems.

Importing additional shapefile data into R using ggplot2 and the "fortify" function

I am trying to do spatial analysis of data by U.S. congressional district in R using ggplot2.
Plotting the map of congressional districts is working fine. The shapefile is available here: http://dds.cr.usgs.gov/pub/data/nationalatlas/cgd113p010g.shp_nt00845.tar.gz
Once unpacked, here's the relevant part of the code I'm running:
library(maptools)
library(rgeos)
library(maps)
library(plyr)
library(ggplot2)
cds13 <- readShapeSpatial("cgd113p010g.shp")
cds13.map <- fortify(cds13)
p <- ggplot() + geom_polygon(aes(x=long, y=lat, group=group), data=cds13.map, fill="white", color="light gray")
p <- p + ylim(c(25,50)) + xlim(c(-125,-65))
p
The geo files, however, have other useful data I'd like to add into the fortified data frame (in this case, cds13.map). See, for example, cds13$CONG_DIST (the district number) and cds13$CONG_REP (the name of the current Representative).
Is there an easy way to import such variables of interest, ideally through a call in fortify (or perhaps using merge)?
CONG_DIST and CONG_REP are located in the data slot (cds13@data) of your imported shapefile. You can add those data to the fortified cds13.map using the function merge(). To merge both objects you should use the id column of the cds13.map object and the row names of the cds13@data object.
cds13.merged<-merge(cds13.map,cds13@data,by.x="id",by.y="row.names")
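One thing to watch: merge() can change the row order, and geom_polygon draws points in row order, so it is worth re-sorting by the order column afterwards. A small sketch with made-up stand-ins (fort mimics the fortify() output, attrs mimics the data slot; the values are invented, and it assumes the fortified id matches the row names):

```r
# Toy stand-in for the fortified data frame
fort <- data.frame(
  id    = rep(c("0", "1"), each = 3),
  order = 1:6,
  long  = c(-100, -99, -99, -90, -89, -89),
  lat   = c(40, 40, 41, 35, 35, 36)
)

# Toy stand-in for the attribute table, keyed by row names
attrs <- data.frame(CONG_DIST = c(1, 2),
                    CONG_REP  = c("Rep A", "Rep B"),
                    row.names = c("0", "1"))

# Join on id vs. row names, keeping every vertex even if attributes are missing
merged <- merge(fort, attrs, by.x = "id", by.y = "row.names", all.x = TRUE)

# Restore the drawing order before plotting
merged <- merged[order(merged$order), ]
```

After this, merged has one row per polygon vertex with CONG_DIST and CONG_REP attached, so either can be used as a fill aesthetic.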

How to change ggplot legend labels and names with two layers?

I am plotting the longitude and latitude coordinates of two different data frames on a map of São Paulo using the ggmap and ggplot2 packages, and I want to manually label each layer in the legend:
update: I edited my code below to become fully reproducible (I was using the geocode function instead of get_map).
update: I would like to do this without combining the data frames.
require(ggmap)
sp <- get_map('sao paulo', zoom=11, color='bw')
restaurants <- data.frame(lon=c(-46.73147, -46.65389, -46.67610),
lat=c(-23.57462, -23.56360, -23.53748))
suppliers <- data.frame(lon=c(-46.70819,-46.68155, -46.74376),
lat=c(-23.53382, -23.53942, -23.56630))
ggmap(sp)+geom_point(data=restaurants, aes(x=lon, y=lat),color='blue',size=4)+geom_point(data=suppliers, aes(x=lon, y=lat), color='red', size=4)
I have looked at several questions and tried different ways without success. Does anyone know how I can insert a legend and label the blue points as restaurants and the red points as suppliers?
Now that your code is reproducible (thanks!):
dat <- rbind(restaurants,suppliers)
dat$grp <- rep(c('Restaurants','Suppliers'),each = 3)
ggmap(sp) +
geom_point(data=dat, aes(x=lon, y=lat,colour = grp),size = 4) +
scale_colour_manual(values = c(Restaurants = 'blue', Suppliers = 'red'))
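If you would rather not combine the data frames (as the update asks), one option is to map colour to a constant label inside aes() in each layer and then assign the actual colours by name. A minimal sketch, using ggplot() as the base so it runs without downloading a map; the legend title "Type" is just a placeholder.

```r
library(ggplot2)

restaurants <- data.frame(lon = c(-46.73147, -46.65389, -46.67610),
                          lat = c(-23.57462, -23.56360, -23.53748))
suppliers   <- data.frame(lon = c(-46.70819, -46.68155, -46.74376),
                          lat = c(-23.53382, -23.53942, -23.56630))

# A constant string inside aes() becomes a legend entry for that layer
p <- ggplot() +                                   # swap in ggmap(sp) to draw on the map
  geom_point(data = restaurants,
             aes(x = lon, y = lat, colour = "Restaurants"), size = 4) +
  geom_point(data = suppliers,
             aes(x = lon, y = lat, colour = "Suppliers"), size = 4) +
  # Named values tie each label to its colour regardless of ordering
  scale_colour_manual(name = "Type",
                      values = c(Restaurants = "blue", Suppliers = "red"))
```

Printing p shows blue restaurants and red suppliers with a two-entry legend, and the two data frames stay separate.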

Resources