How to locate points on map from shapefile in ggplot (ggmap) - r

I want to use the gglocator() function from the ggmap package to identify coordinates on a map to position sub-plots. The following example yields an error message.
library(maptools)
library(ggmap)
## Get shapefiles for plotting
download.file("http://ec.europa.eu/eurostat/cache/GISCO/geodatafiles/NUTS_2010_60M_SH.zip",paste0(tempdir(),"/NUTS_2010_60M_SH.zip"))
unzip(paste0(tempdir(),"/NUTS_2010_60M_SH.zip"),exdir=tempdir())
eurMap <- readShapePoly(fn=paste0(tempdir(),"/NUTS_2010_60M_SH/data/NUTS_RG_60M_2010"))
eurMapDf <- fortify(eurMap, region='NUTS_ID')
## Create the figure
gg <- ggplot(data=eurMapDf) + geom_polygon(aes(x=long, y=lat,group=group))
gg
## use gglocator()
gglocator()
## > Error in `[.data.frame`(data, , deparse(mapping$x)) :
## undefined columns selected
gglocator() works fine with the following example which leads me to suspect that the problem is related to the data passed into the plotting function (the shapefile).
## gglocator works fine with this example
qmap("National Capital Region", zoom = 11)
gglocator(2)
Is there a way to plot shapefiles with ggplot and use gglocator() to identify points? If not, is there any alternative locator function I could use? grid.locator() doesn't work either as it does not output coordinates in the right units. Any hint would be greatly appreciated.
Update
It looks like the mapping slot is not always defined when creating the gg object. I can make gglocator() run by explicitly setting the mapping to the variables of the data slot. E.g. the following works:
gg$mapping$x <- substitute(long)
gg$mapping$y <- substitute(lat)
gg
## gglocator() works as expected
gglocator()
So I guess my real question is how to ensure that mapping is set when calling ggplot.
Update 2
All works fine when the mapping is provided in the ggplot function call and not as argument to geom_polygon(). E.g.
gg2 <- ggplot(data=eurMapDf,aes(x=long, y=lat,group=group)) + geom_polygon()
gg2
## works fine now
gglocator()
I am not sure if there is a reason why mapping needs to be declared in the ggplot() call or if this is a bug. Many of the examples found around the web use the syntax I used first (or to put it differently, the syntax I used first was copy-pasted from one of the many examples I found around the web).
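One way to see the difference between the two calls is to inspect where the aesthetics end up. The sketch below uses a small made-up data frame (toy_df is an assumption, standing in for eurMapDf) and assumes a ggplot2 version in which the plot-level and layer-level mapping can be inspected with `$`. It only illustrates the observation from the updates: aesthetics passed to geom_polygon() live in the layer, so the plot-level mapping that gglocator() appears to read stays empty.

```r
library(ggplot2)

# Toy polygon standing in for the fortified shapefile (hypothetical data)
toy_df <- data.frame(long = c(0, 1, 1, 0), lat = c(0, 0, 1, 1), group = 1)

# Aesthetics given to the layer: the plot-level mapping stays empty
p_layer <- ggplot(data = toy_df) +
  geom_polygon(aes(x = long, y = lat, group = group))

# Aesthetics given to ggplot(): the plot-level mapping is populated
p_plot <- ggplot(data = toy_df, aes(x = long, y = lat, group = group)) +
  geom_polygon()

names(p_layer$mapping)              # plot-level aesthetics of the first plot
names(p_layer$layers[[1]]$mapping)  # layer-level aesthetics of the first plot
names(p_plot$mapping)               # plot-level aesthetics of the second plot
```

Both plots draw the same polygon; they differ only in which object carries the x and y mapping.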

Related

Assigned variable is changing when object is modified - ggplot [duplicate]

I'm trying to copy a ggplot object and then change some properties of the new copied object as, for instance, the colour line to red.
Assume this code:
df = data.frame(cbind(x=1:10, y=1:10))
a = ggplot(df, aes(x=x, y=y)) + geom_line()
b = a
Then, if I change the colour of line of variable a
a$layers[[1]]$geom_params$colour = "red"
it also changes the colour of b
> b$layers[[1]]$geom_params$colour
[1] "red" # why it is not "black"?
I wish I could have two different objects a and b with different characteristics. So, in order to do this in the correct way, I would need to call the plot again for b using b = ggplot(df, aes(x=x, y=y)) + geom_line(). However, at this point in the algorithm, there is no way to know the original plot command ggplot(df, aes(x=x, y=y)) + geom_line().
Do you know what's wrong with this? Are ggplot objects treated in a different manner?
Thanks!
The issue here is that ggplot uses the proto library to mimic OO-style objects. The proto library relies on environments to collect variables for objects. Environments are passed by reference which is why you are seeing the behavior you are (and also a reason no one would probably recommend changing the properties of a layer that way).
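The reference behaviour of environments can be seen without ggplot at all. A minimal base R sketch (the object names are made up for illustration) comparing an environment with an ordinary list:

```r
# An environment is shared by reference: "copying" it only copies the handle
env_a <- new.env()
env_a$colour <- "black"
env_b <- env_a
env_a$colour <- "red"
env_b$colour   # "red", because env_a and env_b are the same environment

# An ordinary list is copied on modification, so the two stay independent
list_a <- list(colour = "black")
list_b <- list_a
list_a$colour <- "red"
list_b$colour  # "black"
```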
Anyway, adapting an example from the proto documentation, we can try to make a deep copy of the layers of the ggplot object. This should "disconnect" them. Here's such a helper function
duplicate.ggplot<-function(x) {
require(proto)
r<-x
r$layers <- lapply(r$layers, function(x) {
as.proto(as.list(x), parent=x)
})
r
}
so if we run
df = data.frame(cbind(x=1:10, y=1:10))
a = ggplot(df, aes(x=x, y=y)) + geom_line()
b = a
c = duplicate.ggplot(a)
a$layers[[1]]$geom_params$colour = "red"
and then plot all three, the result shows that we can change "c" independently from "a"
Ignoring the specifics of ggplot, there's a simple trick to make a deep copy of (almost) any object in R:
obj_copy <- unserialize(serialize(obj, NULL))
This serializes the object to a binary representation suitable for writing to disk and then reconstructs the object from that representation. It's equivalent to saving the object to a file and then loading it again (i.e. saveRDS followed by readRDS), only it never actually saves to a file. It's probably not the most efficient solution, but it should work for just about any object that can be saved to a file.
You can define a deepcopy function using this trick:
deepcopy <- function(p) {
unserialize(serialize(p, NULL))
}
This seems to successfully break the links between related ggplots.
Obviously, this will not work for objects that cannot be serialized, such as big matrices from the bigmemory package.
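To check that the serialize round trip really breaks the link, here is a small base R sketch using a plain environment rather than a ggplot object (the names are hypothetical):

```r
deepcopy <- function(p) {
  unserialize(serialize(p, NULL))
}

original <- new.env()
original$colour <- "black"

alias <- original            # same environment, shared by reference
copy  <- deepcopy(original)  # independent environment with the same contents

original$colour <- "red"

alias$colour  # "red": follows the original
copy$colour   # "black": unaffected by the change
```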

Trouble with error code: "Error: Fill argument neither colors nor valid variable name(s)"

I am trying to make a map with tmap and when I try to view it, this error code comes up:
Error: Fill argument neither colors nor valid variable name(s)
Here is my code.
tm1=tm_shape(myshptime1)+
tm_polygons("zeta.x",style="pretty",
palette="PuOr", title="Spatial Residual \n Relative Risk ")+
tm_layout(frame = FALSE)
tm1
I've tried adding a fill argument, I've tried adding other types of arguments, and I get the same error. I also tried switching from "plot" to "view" to display the map under a different mode and it doesn't work.
This seems to be an issue with your shapefile, which we do not have access to. As your problem is not quite reproducible it will be difficult to help you.
Consider the following example, which uses your exact code but with the shapefile of North Carolina (it ships with the {sf} package, so it is widely available) in place of your myshptime1 object; I have also swapped your "zeta.x" for "AREA", as there is no zeta.x variable in the {sf} shapefile.
library(tmap)
library(sf)
shape <- st_read(system.file("shape/nc.shp", package="sf")) # included with sf package
tm1 = tm_shape(shape)+
tm_polygons("AREA",style="pretty",
palette="PuOr", title="Spatial Residual \n Relative Risk ")+
tm_layout(frame = FALSE)
tm1
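Since the error complains about the variable name, one quick check (a suggestion, not a confirmed diagnosis of your data) is whether the column you pass to tm_polygons() actually exists in the object. The sketch below uses the North Carolina shapefile again; with your own data you would test names(myshptime1) instead.

```r
library(sf)

# North Carolina shapefile that ships with the sf package
shape <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Is the requested variable one of the columns?
"AREA" %in% names(shape)    # this column exists in nc.shp
"zeta.x" %in% names(shape)  # this one does not

# Listing the columns helps spot renamed variables, e.g. suffixes added by merge()
names(shape)
```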

r geom_map fails with GeoJSON map simplified with gSimplify

I'm constructing world maps with countries color-filled with the (continuous) value depending on a column in a data frame called temp.sp. I want to put several of these maps in a graph. I construct each map using ggplot with geom_map and then construct and display the graphs using multiplot() which uses grid code.
I'm using a GeoJSON map (world <- readOGR(dsn = "ne_50m_admin_0_countries.geojson", layer = "OGRGeoJSON")). The resulting SpatialPolygonsDataFrame is 4.1 Mb and the dataframe that results from worldMap <- broom::tidy(world, region = "iso_a3") has 93391 rows. So when I run multiplot with 4 plot files, it takes a long time.
I thought that I could speed up the printing by simplifying the world map with gSimplify using code like world.simp <- gSimplify(world, tol = .1, topologyPreserve = TRUE). The resulting data frame, worldMap.simp only has 27033 rows but when I use this map I get the error message Error in unit(x, default.units) : 'x' and 'units' must have length > 0.
The error message is generated when I run this code with worldMap.simp. When I use worldMap I have no problems.
gg <- ggplot(temp.sp, aes(map_id = id))
gg <- gg + geom_map(aes(fill = temp.sp$value), map = worldMap.simp, color = "white")
I tried converting temp.sp$value to factor but it made no difference.
To summarize, using a gSimplified map causes the displaying of a graph produced with ggplot and geom_map to fail.
Rather than try to figure out what was going wrong with gSimplify, I found and downloaded a lower resolution map from http://geojson.xyz. The one I'm currently using is
https://d2ad6b4ur7yvpq.cloudfront.net/naturalearth-3.3.0/ne_110m_admin_0_countries.geojson
Note that it has a similar filename, but with 110m instead of 50m.
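The cause of the gSimplify failure was not established here. One thing that may be worth checking (an assumption, not a confirmed diagnosis) is whether the id values in the simplified map data frame still match the id column of the data, since geom_map joins the two on that key. A minimal sketch with made-up data frames standing in for temp.sp, worldMap and worldMap.simp:

```r
# Hypothetical stand-ins for the data and the two tidied map data frames
temp_sp  <- data.frame(id = c("FRA", "DEU", "ITA"), value = c(1.2, 3.4, 5.6))
map_full <- data.frame(id = rep(c("FRA", "DEU", "ITA"), each = 2))
map_simp <- data.frame(id = rep(c("0", "1", "2"), each = 2))  # ids lost (assumed)

# Which ids in the data have no counterpart in each map?
missing_full <- setdiff(temp_sp$id, map_full$id)
missing_simp <- setdiff(temp_sp$id, map_simp$id)

missing_full  # empty when every id is matched
missing_simp  # non-empty when the map ids no longer line up with the data
```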

How to only change parameters for "lower" plots in the ggpairs function from GGally package

I have the following example
data(diamonds, package="ggplot2")
diamonds.samp <- diamonds[sample(1:dim(diamonds)[1],200),]
ggpairs(diamonds.samp, columns=8:10,
upper=list(continuous='cor'),
lower=list(continuous = 'points'),
diag=list(continuous='density'),
axisLabels='show'
)
Resulting in a really nice figure:
But my problem is that in the real dataset I have too many points, so I would like to change the parameters for the point geom. I want to reduce the dot size and use a lower alpha value. However, I cannot do this with the "params" option, as it applies to all plots, not just the lower ones:
ggpairs(diamonds.samp, columns=8:10,
upper=list(continuous='cor'),
lower=list(continuous = 'points'),
diag=list(continuous='density'),
params=c(alpha=1/10),
axisLabels='show'
)
resulting in this plot:
Is there a way to apply parameters to only "lower" plots - or do I have to use the ability to create custom plots as suggested in the topic How to adjust figure settings in plotmatrix?
In advance - thanks!
There doesn't seem to be any elegant way to do it, but you can bodge it by writing a function to get back the existing subchart calls from the ggally_pairs() object and then squeezing the params in before the last bracket. [not very robust, it'll only work if the graphs are already valid]
diamonds.samp <- diamonds[sample(1:dim(diamonds)[1],200),]
g<-ggpairs(diamonds.samp, columns=8:10,
upper=list(continuous='cor'),
lower=list(continuous = 'points'),
diag=list(continuous='density'),
axisLabels='show'
)
add_p<-function(g,i,params){
side=length(g$columns) # get number of cells per side
lapply(i,function(i){
s<-as.character(g$plots[i]) # get existing call as a template
l<-nchar(s)
p<-paste0(substr(s,1,l-1),",",params,")") # append params before last bracket
r<-(i-1)%/%side+1 # work out the row on the grid (1-based)
c<-(i-1)%%side+1 # and the column (1-based), also correct for the last column
array(c(p,r,c)) # return the sub-plot and position data
})
}
rep_cells<-c(4,7,8)
add_params<-"alpha=0.3, size=0.1, color='red'"
ggally_data<-g$data # makes sure that the internal calls pick up your data (they always refer to the data as 'ggally_data')
calls<-add_p(g,rep_cells,params=add_params) #call the function
for(i in 1:length(calls)){g<-putPlot(g,calls[[i]][1],as.numeric(calls[[i]][2]),as.numeric(calls[[i]][3]))}
g # call the plot
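In more recent versions of GGally than the one this answer was written for, the wrap() helper may make the string surgery unnecessary. The sketch below is an alternative under that assumption, not the method used above: it passes alpha and size only to the lower panels. The "densityDiag" name for the diagonal is also specific to newer GGally versions.

```r
library(GGally)

data(diamonds, package = "ggplot2")
set.seed(1)  # reproducible sample
diamonds.samp <- diamonds[sample(nrow(diamonds), 200), ]

g <- ggpairs(
  diamonds.samp, columns = 8:10,
  upper = list(continuous = "cor"),
  # wrap() attaches extra arguments to this panel type only
  lower = list(continuous = wrap("points", alpha = 0.3, size = 0.1)),
  diag  = list(continuous = "densityDiag"),
  axisLabels = "show"
)
g
```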

ggmap with geom_map superimposed

library(sp)
library(spdep)
library(ggplot2)
library(ggmap)
library(rgdal)
Get and fiddle with data:
nc.sids <- readShapePoly(system.file("etc/shapes/sids.shp", package="spdep")[1],ID="FIPSNO", proj4string=CRS("+proj=longlat +ellps=clrk66"))
nc.sids=spTransform(nc.sids,CRS("+init=epsg:4326"))
Get background map from stamen.com, plot, looks nice:
ncmap = get_map(location=as.vector(bbox(nc.sids)),source="stamen",maptype="toner",zoom=7)
ggmap(ncmap)
Create a data frame with long,lat,Z, and plot over the map and a blank plot:
ncP = data.frame(coordinates(nc.sids),runif(nrow(nc.sids)))
colnames(ncP)=c("long","lat","Z")
ggmap(ncmap)+geom_point(aes(x=long,y=lat,col=Z),data=ncP)
ggplot()+geom_point(aes(x=long,y=lat,col=Z),data=ncP)
give it some unique ids called 'id' and fortify (with vitamins and iron?)
nc.sids@data[,1]=1:nrow(nc.sids)
names(nc.sids)[1]="id"
ncFort = fortify(nc.sids)
Now, my map and my limits, I want to plot the 74 birth rate:
myMap = geom_map(aes(fill=BIR74,map_id=id),map=ncFort,data=nc.sids@data)
Limits = expand_limits(x=ncFort$long,y=ncFort$lat)
and on a blank plot I can:
ggplot() + myMap + Limits
but on a ggmap I can't:
ggmap(ncmap) + myMap + Limits
# Error in eval(expr, envir, enclos) : object 'lon' not found
Some versions:
> packageDescription("ggplot2")$Version
[1] "0.9.0"
> packageDescription("ggmap")$Version
[1] "2.0"
I can add geom_polygon to ggplot or ggmap and it works as expected. So something is up with geom_map....
The error message is, I think, the result of an inheritance issue. Typically, it comes about when different data frames are used in subsequent layers.
In ggplot2, every layer inherits default aes mappings set globally in the initial call to ggplot. For instance, ggplot(data = data, aes(x = x, y = y)) sets x and y mappings globally so that all subsequent layers expect to see x and y in whatever data frame has been assigned to them. If x and y are not present, an error message similar to the one you got results. See here for a similar problem and a range of solutions.
In your case, it's not obvious because the first call is to ggmap - you can't see the mappings nor how they are set because ggmap is all nicely wrapped up. Nevertheless, ggmap calls ggplot somewhere, and so default aesthetic mappings must have been set somewhere in the initial call to ggmap. It follows then that ggmap followed by geom_map without taking account of inheritance issues results in the error.
So, Kohske's advice in the earlier post applies - "you need to nullify the lon aes in geom_map when you use a different dataset". Without knowing too much about what has been set or how they've been set, it's probably simplest to clobber the lot by adding inherit.aes = FALSE to the second layer - the call to geom_map.
Note that you don't get the error message with ggplot() + myMap + Limits because no aesthetics have been set in the ggplot call.
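The inheritance issue can be reproduced with plain ggplot2, without ggmap. In this sketch the data frames and column names are made up: base_df plays the role of the data ggmap sets up internally (with a global x = lon mapping), and other_df is a second data set that has no lon column.

```r
library(ggplot2)

# Hypothetical data: the base plot maps x = lon, y = lat globally
base_df  <- data.frame(lon = c(0, 1), lat = c(0, 1))
other_df <- data.frame(long = c(0.2, 0.8), lati = c(0.5, 0.5))

p <- ggplot(base_df, aes(x = lon, y = lat)) + geom_point()

# This layer inherits x = lon and y = lat, but other_df has neither column,
# so building the plot fails with an "object 'lon' not found" style error
bad <- p + geom_point(aes(colour = long), data = other_df)
bad_result <- try(ggplot_build(bad), silent = TRUE)
inherits(bad_result, "try-error")

# With inherit.aes = FALSE the layer uses only its own mappings
good <- p + geom_point(aes(x = long, y = lati), data = other_df,
                       inherit.aes = FALSE)
good_result <- ggplot_build(good)
```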
In what follows, I'm using R version 2.15.0, ggplot2 version 0.9.1, and ggmap version 2.1. I use your code almost exactly, except for the addition of inherit.aes = FALSE in the call to geom_map. That one small change allows ggmap and geom_map to be superimposed:
library(sp)
library(spdep)
library(ggplot2)
library(ggmap)
library(rgdal)
#Get and fiddle with data:
nc.sids <- readShapePoly(system.file("etc/shapes/sids.shp", package="spdep")[1],ID="FIPSNO", proj4string=CRS("+proj=longlat +ellps=clrk66"))
nc.sids=spTransform(nc.sids,CRS("+init=epsg:4326"))
#Get background map from stamen.com, plot, looks nice:
ncmap = get_map(location=as.vector(bbox(nc.sids)),source="stamen",maptype="toner",zoom=7)
ggmap(ncmap)
#Create a data frame with long,lat,Z, and plot over the map and a blank plot:
ncP = data.frame(coordinates(nc.sids),runif(nrow(nc.sids)))
colnames(ncP)=c("long","lat","Z")
ggmap(ncmap)+geom_point(aes(x=long,y=lat,col=Z),data=ncP)
ggplot()+geom_point(aes(x=long,y=lat,col=Z),data=ncP)
#give it some unique ids called 'id' and fortify (with vitamins and iron?)
nc.sids@data[,1]=1:nrow(nc.sids)
names(nc.sids)[1]="id"
ncFort = fortify(nc.sids)
#Now, my map and my limits, I want to plot the 74 birth rate:
myMap = geom_map(inherit.aes = FALSE, aes(fill=BIR74,map_id=id), map=ncFort,data=nc.sids@data)
Limits = expand_limits(x=ncFort$long,y=ncFort$lat)
# and on a blank plot I can:
ggplot() + myMap + Limits
# and now on a ggmap too:
ggmap(ncmap) + myMap + Limits
The result from the last line of code is:
