I have a data set (Data_Base) with the number of murders in one specific year for the 32 Mexican states. I downloaded the spatial data for these 32 states with gadm_sf_loadCountries.
Now I want to merge this data frame with the sf object so that I can plot a map filled by the number of murders per state. The code is the following:
#Export spatial data from Mexico
mex.sf = gadm_sf_loadCountries("MEX", level=1, basefile="./")
mex.sf <- mex.sf$sf
#Merge the SF object and the Data frame
mex.sf <- left_join(Data_Base,mex.sf, by = c('State'='shapeName'))
#Run the graph code, I want to fill the map with the murders per state
ggplot(mex.sf)+geom_sf(aes(fill=murders)) +
scale_fill_manual(values=c("#d7301f","#fc8d59","#fdcc8a","#fef0d9"))
However, when I draw the plot I get the following error:
Error: stat_sf requires the following missing aesthetics: geometry
Run `rlang::last_error()` to see where the error occurred.
I honestly don't know why it is showing me this error, because the geometry column was carried over when I did the left join.
Does anyone know what causes this error?
TLDR: change the order of your objects in left_join(Data_Base,mex.sf, ...)
Long version: When joining data items to a {sf} spatial object the order of your objects matters.
The class of the output of the dplyr::*_join() family of functions (inner, left, whatever...) is determined by the first argument. The presence of a geometry column is not sufficient for the resulting object to behave as expected; class matters.
To illustrate my argument consider this piece of code (I don't have your data on hand, so I make do with the North Carolina shapefile that ships with {sf}):
library(sf)
library(dplyr)
shape <- st_read(system.file("shape/nc.shp", package="sf"))
data <- data.frame(CNTY_ID = 1825,
label = "this be county Ashe")
data_first <- inner_join(data, shape, by = "CNTY_ID") # joining shape to data = this will break later on
class(data_first)
# [1] "data.frame"
plot(data_first["label"]) # this will be a mess...
# Error in plot.window(...) : need finite 'ylim' values
# In addition: Warning messages:
# 1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
# 2: In min(x) : no non-missing arguments to min; returning Inf
# 3: In max(x) : no non-missing arguments to max; returning -Inf
shape_first <- inner_join(shape, data, by = "CNTY_ID") # joining data to shape = this is the way to go!
class(shape_first)
# [1] "sf" "data.frame"
plot(shape_first["label"]) # works like a charm...
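Applied to the question, that means putting the spatial object first in the join. A minimal sketch, again on the North Carolina data: the murders column and its values are made up to stand in for Data_Base, and st_as_sf() is shown as one possible way to repair a join that was already done data-first.

```r
library(sf)
library(dplyr)

shape <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Hypothetical attribute table standing in for Data_Base
data <- data.frame(CNTY_ID = c(1825, 1827), murders = c(3, 7))

# Spatial object first: the result keeps class "sf"
shape_first <- left_join(shape, data, by = "CNTY_ID")
class(shape_first)

# Data frame first: the geometry column is there, but the class is lost
data_first <- left_join(data, shape, by = "CNTY_ID")
class(data_first)

# st_as_sf() picks up the geometry column and restores the "sf" class
restored <- st_as_sf(data_first)
class(restored)
```

With the real data the call would presumably be left_join(mex.sf, Data_Base, by = c('shapeName' = 'State')), assuming those are the matching columns.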
Related
I have another geodata question.
I am trying to access the elevation data for specific points with get_elev_point().
I have about 160 points that I would like to get the elevation data for, so I would like to not do it one by one but I do not know how to retrieve them all at once.
I have made a data.frame of all my points where the first column x is longitude and the second column y is latitude and these are the only data in the data.frame.
> elevationpoints <-get_elev_point(locations = pontok_df, units="meters", prj = ll_prj, src = "aws")
Error in seq.default(from = ceiling(min_tile[2]), to =
floor(max_tile[2])) : 'to' must be a finite number In addition:
Warning message: In log(tan(lat_rad) + (1/cos(lat_rad))) : NaNs
produced
But it does not seem to yield any results. Could you please help me? Much appreciated!
Kamilla
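One thing worth checking before the call: the warning comes from log(tan(lat_rad) + (1/cos(lat_rad))), which returns NaN for latitudes outside the range -90 to 90, and a missing coordinate would not give a finite tile range either. The sketch below is only a sanity check on the input, not a confirmed diagnosis; the values in pontok_df are invented for illustration.

```r
# Hypothetical stand-in for pontok_df: x = longitude, y = latitude
pontok_df <- data.frame(
  x = c(19.04, 19.10, NA),
  y = c(47.50, 475.0, 46.25)   # second latitude has a misplaced decimal point
)

# Flag rows that cannot be valid longitude/latitude pairs
bad <- is.na(pontok_df$x) | is.na(pontok_df$y) |
  abs(pontok_df$x) > 180 | abs(pontok_df$y) > 90

pontok_df[bad, ]               # rows to inspect before querying elevations
clean_df <- pontok_df[!bad, ]  # rows that are safe to pass on

# The expression from the warning, evaluated for an out-of-range latitude
lat_rad <- 475 * pi / 180
suppressWarnings(log(tan(lat_rad) + 1 / cos(lat_rad)))
```

If rows are flagged, swapped longitude/latitude columns or coordinates stored in a projected system are the usual suspects.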
The data
I have two shapefiles marking the boundaries of national and provincial electoral constituencies in Pakistan.
The objective
I am attempting to use R to create a key that will generate a list of which provincial-level constituencies are "contained within" or otherwise intersecting with which national-level constituencies, based on their coordinates in this data. For example, NA-01 corresponds with PA-01, PA-02, PA-03; NA-02 corresponds with PA-04 and PA-05, etc. (The key will ultimately be used to link separate dataframes containing electoral results at the national and provincial level; that part I've figured out.)
I have only basic/intermediate R skills learned largely through trial and error and no experience working with GIS data outside of R.
The attempted solution
The closest solution I could find for this problem comes from this guide to calculating intersection areas in R. However, I have been unable to successfully replicate any of the three proposed approaches (either the questioner's use of a general TRUE/FALSE report on intersections, or the more precise calculations of area of overlap).
The code
# import map files
NA_map <- readOGR(dsn = "./National_Constituency_Boundary", layer = "National_Constituency_Boundary")
PA_map <- readOGR(dsn = "./Provincial_Constituency_Boundary", layer = "Provincial_Constituency_Boundary")
# Both are now SpatialPolygonsDataFrame objects of 273 and 577 elements, respectively.
# If relevant, I used spdplyr to tweak some of the data attribute names (for use later when joining to electoral dataframes):
NA_map <- NA_map %>%
rename(constituency_number = NA_Cons,
district_name = District,
province = Province)
PA_map <- PA_map %>%
rename(province = PROVINCE,
district_name = DISTRICT,
constituency_number = PA)
# calculate intersections, take one
Results <- gIntersects(NA_map, PA_map, byid = TRUE)
# this creates a large matrix of 157,521 elements
rownames(Results) <- NA_map@data$constituency_number
colnames(Results) <- PA_map@data$constituency_number
Attempting to add the rowname/colname labels, however, gives me the error message:
Error in dimnames(x) <- dn :
length of 'dimnames' [1] not equal to array extent
Without the rowname/colname labels, I'm unable to read the overlay matrix, and unsure how to filter them so as to produce a list of only TRUE intersections that would help make a NA-PA key.
I also attempted to replicate the other two proposed solutions for calculating exact area of overlap:
# calculate intersections, take two
pi <- intersect(NA_map, PA_map)
# this generates a SpatialPolygons object with 273 elements
areas <- data.frame(area=sapply(pi@polygons, FUN = function(x) {slot(x, 'area')}))
# this calculates the area of intersection but has no other variables
row.names(areas) <- sapply(pi@polygons, FUN=function(x) {slot(x, 'ID')})
This generates the error message:
Error in `row.names<-.data.frame`(`*tmp*`, value = c("2", "1", "4", "5", :
duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique value when setting 'row.names': ‘1’
So that when I attempt to attach areas to attributes info with
attArrea <- spCbind(pi, areas)
I get the error message
Error in spCbind(pi, areas) : row names not identical
Attempting the third proposed method:
# calculate intersections, take three
pi <- st_intersection(NA_map, PA_map)
Produces the error message:
Error in UseMethod("st_intersection") :
no applicable method for 'st_intersection' applied to an object of class "c('SpatialPolygonsDataFrame', 'SpatialPolygons', 'Spatial', 'SpatialPolygonsNULL', 'SpatialVector')"
I understand that my SPDF maps can't be used for this third approach, but it wasn't clear from the description what steps would be needed to transform them and attempt this method.
The plea for help
Any suggestions on corrections necessary to use any of these approaches, or pointers towards some other method of figuring this, would be greatly appreciated. Thanks!
Here is some example data
library(raster)
p <- shapefile(system.file("external/lux.shp", package="raster"))
p1 <- aggregate(p, by="NAME_1")
p2 <- p[, 'NAME_2']
So we have p1 with regions, and p2 with lower level divisions.
Now we can do
x <- intersect(p1, p2)
# or x <- union(p1, p2)
data.frame(x)
Which should be (and is) the same as the original
data.frame(p)[, c('NAME_1', 'NAME_2')]
To get the area of the polygons, you can do
x$area <- area(x) / 1000000 # divide to get km2
There are likely to be many "slivers", very small polygons because of slight variations in borders. That might not matter to you.
But another approach could be matching by centroid:
y <- p2
e <- extract(p1, coordinates(p2))
y$NAME_1 <- e$NAME_1
data.frame(y)
Your code isn't self-contained, so I didn't try to replicate the errors you report.
However, getting the 'key' you want is very simple using the sf package (which is intended to supersede rgeos, rgdal and sp in the near future). See here:
library(sf)
# Download shapefiles
national.url <- 'https://data.humdata.org/dataset/5d48a142-1f92-4a65-8ee5-5d22eb85f60f/resource/d85318cb-dcc0-4a59-a0c7-cf0b7123a5fd/download/national-constituency-boundary.zip'
provincial.url <- 'https://data.humdata.org/dataset/137532ad-f4a9-471e-8b5f-d1323df42991/resource/c84c93d7-7730-4b97-8382-4a783932d126/download/provincial-constituency-boundary.zip'
download.file(national.url, destfile = file.path(tempdir(), 'national.zip'))
download.file(provincial.url, destfile = file.path(tempdir(), 'provincial.zip'))
# Unzip shapefiles
unzip(file.path(tempdir(), 'national.zip'), exdir = file.path(tempdir(), 'national'))
unzip(file.path(tempdir(), 'provincial.zip'), exdir = file.path(tempdir(), 'provincial'))
# Read map files
NA_map <- st_read(dsn = file.path(tempdir(), 'national'), layer = "National_Constituency_Boundary")
PA_map <- st_read(dsn = file.path(tempdir(), 'provincial'), layer = "Provincial_Constituency_Boundary")
# Get sparse list representation of intersections
intrs.sgpb <- st_intersects(NA_map, PA_map)
length(intrs.sgpb) # One list element per national constituency
# [1] 273
print(intrs.sgpb[[1]]) # Indices of provincial constituencies intersecting with first national constituency
# [1] 506 522 554 555 556
print(PA_map$PROVINCE[intrs.sgpb[[1]]][1]) # Province of the first provincial constituency intersecting with first national constituency
# [1] KHYBER PAKHTUNKHWA
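To turn that sparse list into the NA-PA key, the indices can be expanded into a two-column data frame. The sketch below uses toy rectangles rather than the real constituencies: rect(), na_id and pa_id are made-up names. It also shows one caveat: st_intersects() counts polygons that merely touch along a border, so st_relate() with the DE-9IM pattern "2********" (interiors overlap in an area) is used as one possible stricter test.

```r
library(sf)

# Hypothetical helper: axis-aligned rectangle as an sf polygon
rect <- function(xmin, ymin, xmax, ymax) {
  st_polygon(list(rbind(c(xmin, ymin), c(xmax, ymin), c(xmax, ymax),
                        c(xmin, ymax), c(xmin, ymin))))
}

# Two "national" units, each made of two "provincial" units
national <- st_sf(na_id = c("NA-01", "NA-02"),
                  geometry = st_sfc(rect(0, 0, 2, 1), rect(2, 0, 4, 1)))
provincial <- st_sf(pa_id = c("PA-01", "PA-02", "PA-03", "PA-04"),
                    geometry = st_sfc(rect(0, 0, 1, 1), rect(1, 0, 2, 1),
                                      rect(2, 0, 3, 1), rect(3, 0, 4, 1)))

# Expand a sparse predicate result into a long key table
make_key <- function(hits) {
  data.frame(na_id = rep(national$na_id, lengths(hits)),
             pa_id = provincial$pa_id[unlist(hits)])
}

# Plain intersection also catches neighbours that only share a border
touch_key <- make_key(st_intersects(national, provincial))

# Requiring a two-dimensional overlap of interiors drops those neighbours
key <- make_key(st_relate(national, provincial, pattern = "2********"))
key
```

An existing SpatialPolygonsDataFrame can be converted with st_as_sf() instead of re-reading the shapefile. On real boundaries, small slivers can still pass the stricter test, so filtering on the area of st_intersection() may be needed as well.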
I'm having a lot of difficulty creating a prediction grid (for the new_data argument) to use with the autoKrige function in the automap package.
I've already tried following the steps in this post (How to subset SpatialGrid using SpatialPolygon) but get the following error :
Error in x@coords[i, , drop = FALSE] :
(subscript) logical subscript too long
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
My (limited) understanding is the error relates to there being no non-missing arguments because it is an empty grid. This is fine - all I want is an empty grid constrained by a polygon from a shapefile.
Here is the code I'm working with:
shp <- shapefile("C://path/path/Tobay_Box2.shp")
shp <- spTransform (shp,"+proj=utm +ellps=WGS84 +datum=WGS84")
grid <- GridTopology(cellcentre.offset=c(731888.0,7457552.0),cellsize=c(2,2),cells.dim=c(122,106))
grid <- SpatialPixelsDataFrame(grid,
data=data.frame(id=1:prod(122,106)),
proj4string=CRS("+proj=utm +ellps=WGS84 + datum=WGS84"))
plot(grid)
[see dropbox folder 'Grid.png']
bound <- shp@polygons
bound <- SpatialPolygons(bound, proj4string=CRS("+proj=utm +ellps=WGS84 +datum=WGS84"))
plot(bound)
[see dropbox folder 'Boundary plot.png']
clip_grid <- grid[!is.na(over(grid, bound)),]
No errors or warnings up to this point. But then...
plot(clip_grid)
Error in x@coords[i, , drop = FALSE] :
(subscript) logical subscript too long
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
or, when attempting to pass the object clip_grid to autoKrige as the new_data argument:
PerInkrg <- autoKrige (PerArIn~1, hs1, clip_grid)
Error in predict.gstat(g, newdata = newdata, block = block, nsim = nsim, :
value not allowed for: %s %s newdata empty or only NA's
I've had no issues using the non-clipped grid (object = grid).
In a nutshell, I require this [see dropbox folder 'Autokrig plot'] but with the interpolated surface constrained (clipped) to the boundary extent of 'Torbay_Box2.shp'
P.S. I attempted to insert images of my plots and links to other posts I've used before asking for help here and a link to my data but as a new user I don't have enough reputation to do this - sorry!
Data and plots can be found on Dropbox.com/sh/yqg20z1ibl3h4aa/AACJnHoEuP-S5fTvAXxsnY1za?dl=0
I've now managed to produce an autoKrige [plot] which is masked to the extent of the Torbay_Box2 boundary. However, I never achieved this in the 'conventional' way by creating a prediction grid like meuse.grid. The result is the same so for now I'm happy but I would still like to do it the conventional way eventually.
Here's how I cheated it:
# Load sample box extent
bx.data <- readOGR (".", "Tobay_Box2")
bx <- spTransform(bx.data,"+proj=utm +ellps=WGS84 +datum=WGS84") # transforms to UTM projection
str(bx)
# Set the boundary extent with that of sample box extent
hs1@bbox <- bx@bbox
#create an empty grid
grd <- as.data.frame(spsample(hs1, "regular", n=50000))
names(grd) <- c("X", "Y")
coordinates(grd) <- c("X", "Y")
gridded(grd) <- TRUE # Create SpatialPixel object
fullgrid(grd) <- TRUE # Create SpatialGrid object
plot(hs1)
plot(grd, pch = ".", add = T)
proj4string(grd) <- proj4string(hs1)
I then performed an IDW interpolation using the empty grid as the newdata, converted the output to a raster, clipped this to the Torbay_Box2 boundary and then converted the result to a SpatialPixels object, which I passed to autoKrige as the new_data argument:
# For PerArIn (% area inhabited)
#interpolate the grid cells using all points and a power value of 2
hs1.idw <- gstat::idw(PerArIn ~ 1, hs1, newdata=grd, idp=2.0)
# Convert to raster object then clip to Hollicombe sample box
r <- raster(hs1.idw)
r.m <- mask(r, bx)
#Convert and set as prediction grid for Kriging
grd<- rasterToPoints(r.m, spatial=TRUE)
gridded(grd) <- TRUE
grd <- as (grd, "SpatialPixels")
#en voila!
PerInkrg <- autoKrige (PerArIn~1, hs1,grd)
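For the 'conventional' route, one possible sketch is to build a regular grid over the bounding box of the boundary and keep only the cell centres that fall inside the polygon. The polygon below is invented to stand in for Torbay_Box2, and the 2 x 2 cell size is taken from the question; with the real data, the grid and the boundary must share the same CRS, otherwise over() returns only NA and the clipped grid is empty.

```r
library(sp)

# Hypothetical boundary polygon standing in for Torbay_Box2
ring <- cbind(x = c(0, 100, 100, 0, 0), y = c(0, 0, 60, 100, 0))
bound <- SpatialPolygons(list(Polygons(list(Polygon(ring)), ID = "box")))

# Regular grid of 2 x 2 cells covering the bounding box of the polygon
bb <- bbox(bound)
cellsize <- c(2, 2)
dims <- ceiling(unname(bb[, "max"] - bb[, "min"]) / cellsize)
topo <- GridTopology(cellcentre.offset = unname(bb[, "min"]) + cellsize / 2,
                     cellsize = cellsize, cells.dim = dims)
centres <- SpatialPoints(coordinates(topo))

# Keep only the cell centres that lie inside the polygon
inside <- !is.na(over(centres, bound))
clip_grid <- SpatialPixels(centres[inside, ])

plot(clip_grid)
plot(bound, add = TRUE)
```

The resulting clip_grid has no data attached, which is what a prediction grid for the new_data argument needs.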
I've created the following script to get the coastline of Denmark
# Get Shapefiles for Coastline
shpurl <- "http://download.geofabrik.de/europe/denmark-latest.shp.zip"
tmp <- tempfile(fileext=".zip")
download.file(shpurl, destfile = tmp)
files <- unzip(tmp, exdir=getwd())
# Load & plot shapefile
library(maptools)
shp <- readShapePoly(files[grep(".shp$", shpurl)])
plot(shp)
This should give me the outline of Denmark, however, I keep getting the following error:
Error in plot.window(...) : need finite 'ylim' values
In addition: Warning messages:
1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
2: In min(x) : no non-missing arguments to min; returning Inf
3: In max(x) : no non-missing arguments to max; returning -Inf
Any help or directions are appreciated.
Have a look at your files value:
> length(files)
[1] 41
The archive you are downloading contains a number of shapefiles for various geographies. For instance the code:
require(rgdal)
shp <- readOGR(dsn = "whereIsavedyourStuff/Stacks", layer = "roads")
will execute properly. But you will need to specify, which of the sourced shape files you want to read.
As a side point, with respect to reading files from the net, I would suggest that you have a look at this code:
# Download and read US state shapefiles
tmp_shps <- tempfile(); tmp_dir <- tempdir()
download.file("http://www2.census.gov/geo/tiger/GENZ2014/shp/cb_2014_us_state_20m.zip",
tmp_shps)
unzip(tmp_shps, exdir = tmp_dir)
# Libs
require(rgdal)
# Read
us_shps <- readOGR(dsn = tmp_dir, layer = "cb_2014_us_state_20m")
Irrespective of which method you use to read shapefiles, you have to specify the path and the file name. In the readOGR code above this is done through the dsn and layer arguments. In your code you used files[grep(".shp$", shpurl)], but the file names within your 195MB archive do not correspond to the URL. You have a few options here:
You can download and unpack the files as you did, list the names of all the *.shp files, and read them in a loop into a list (in effect, each layer has to be read from its own set of files).
Better, specify the layer you want to read, similarly to the code provided above.
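To see why the original subsetting returns nothing, and how the layer names could be listed instead, here is a small sketch. The file names in files are made up to stand in for what unzip() returns; only the URL is taken from the question.

```r
shpurl <- "http://download.geofabrik.de/europe/denmark-latest.shp.zip"

# Hypothetical stand-in for the character vector returned by unzip()
files <- c("./roads.shp", "./roads.dbf", "./roads.shx",
           "./places.shp", "./places.dbf", "./places.shx")

# The pattern is matched against the URL, which ends in ".zip",
# so nothing matches and the subset of files is empty
grep(".shp$", shpurl)
files[grep(".shp$", shpurl)]

# Matching against the file names themselves gives the shapefiles
shp_files <- grep("\\.shp$", files, value = TRUE)

# Layer names are the file names without directory and extension
layers <- tools::file_path_sans_ext(basename(shp_files))
layers
```

Any one of these layer names can then be passed as the layer argument, with the directory used in unzip() as dsn.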
I need to convert shapefiles into raster format.
I used the function "rasterize" in R package "raster", but the result does not look correct.
tst <- rasterize(shpfile, r, fun="count")
Found 5 region(s) and 5 polygon(s)
No grid cell has any occurrence records:
range(tst[],na.rm=TRUE)
[1] Inf -Inf
Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
sum(tst[],na.rm=TRUE)
[1] 0
The R script that I wrote:
# download the GIS shape file
download.file("http://esp.cr.usgs.gov/data/little/abiebrac.zip",
destfile = "abiebrac.zip")
# unzip
unzip("abiebrac.zip")
# Read shapefile into R
library(maptools)
shpfile <- readShapePoly("abiebrac")
extent(shpfile)
# mapping
plot(shpfile)
library(maps)
map("world",xlim=c(-180,-50),ylim=c(7,83),add=FALSE)
plot(shpfile,add=TRUE,lwd=10)
# rasterize
library(raster)
GridSize <- 0.5 # degree
r <- raster(ncol= round(abs(-180-(-50))/GridSize),
nrow=round(abs(83-7)/GridSize))
extent(r) <- extent(c(-180, -50, 7, 83))
tst <- rasterize(shpfile, r, fun="count")
# summary
sum(tst[],na.rm=TRUE)
range(tst[],na.rm=TRUE)
# mapping
plot(tst,col="red",main="abiebrac")
map("world",xlim=c(-180,-50),ylim=c(7,83),add=TRUE)
I am not sure why you are using "count" in the fun argument but in this case, because there is no overlap, it is producing NA results. You also need to specify an attribute field of the SpatialPolygonsDataFrame object to assign values to your raster. You can also pull the extent directly from the sp object.
This code seems to yield what you want.
require(raster)
require(rgdal)
require(sp)
setwd("D:/TMP")
shpfile <- readOGR(getwd(), "abiebrac")
r <- raster(extent(shpfile))
res(r) <- 0.05
r <- rasterize(shpfile, r, field = "ABIEBRAC_")
plot(r)
plot(shpfile,lwd=10,add=TRUE)
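The same pattern can be tried without downloading anything. This is a sketch on an invented square polygon; the attribute name code and its value are assumptions for illustration, not part of the original data.

```r
library(sp)
library(raster)

# Hypothetical polygon with one attribute column
ring <- cbind(c(0, 10, 10, 0, 0), c(0, 0, 10, 10, 0))
polys <- SpatialPolygons(list(Polygons(list(Polygon(ring)), ID = "a")))
spdf <- SpatialPolygonsDataFrame(polys, data.frame(code = 5, row.names = "a"))

# Raster that is wider than the polygon, with 1 x 1 cells
r <- raster(extent(0, 20, 0, 10))
res(r) <- 1

# Cells whose centre falls inside the polygon get the attribute value,
# all other cells stay NA
out <- rasterize(spdf, r, field = "code")

table(values(out), useNA = "ifany")
plot(out)
plot(spdf, add = TRUE)
```

Choosing the cell size relative to the polygon size matters: if the cells are much larger than the polygons, few or no cell centres fall inside them and the result is mostly NA.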