Intersect in R - missing one polygon

1. The problem
I'm trying to extract the intersection of two polygon shapes in R. The first is the watershed polygon "ws_polygon_2", and the second is the set of Voronoi polygons of 5 rain gauges, constructed from the Excel sheet "DATA.xlsx"; both are available here: link.
The code is the following:
#[1] Montagem da tabela de coordenadas dos postos pluviométricos
library(sp)
library(readxl)
dados_precipitacao_1985 <- read_excel(path="C:/Users/.../DATA.xlsx")
coordinates(dados_precipitacao_1985) <- ~ x + y
proj4string(dados_precipitacao_1985) <- CRS("+proj=longlat +datum=WGS84")
d_prec <- spTransform(dados_precipitacao_1985, CRSobj = "+init=epsg:3857")
#[2] Coleta dos dados espaciais da bacia hidrográfica
library(rgdal)
bacia_Caio_Prado <- readOGR(dsn="C:/Users/...", layer="ws_polygon_2")
bacia_WGS <- spTransform(bacia_Caio_Prado, CRSobj = "+proj=longlat +datum=WGS84")
bacia_UTM <- spTransform(bacia_Caio_Prado, CRSobj = "+init=epsg:3857")
#[3] Poligonos de Thiessen - 1 INTERPOLAÇÃO
library(dismo)
library(rgeos)
library(raster)
library(mapview)
limits_voronoi_WGS <- c(-40.00,-38.90,-5.00,-4.50)
v_WGS <- voronoi(dados_precipitacao_1985, ext=limits_voronoi_WGS)
bc <- aggregate(bacia_WGS)
u_WGS_1 <- gIntersection(spgeom1 = v_WGS, spgeom2 = bc,byid=TRUE)
u_WGS_2 <- intersect(bc, v_WGS)
When I apply the intersect function, the returned variable u_WGS_2 is a SpatialPolygonsDataFrame with only 4 features instead of 5 (the Voronoi object v_WGS does have 5 features).
On the other hand, when I apply the gIntersection function, I get 5 features. However, the u_WGS_1 object is a SpatialPolygons object only, and I lose the rainfall data.
I'd like to know if I am making a mistake, or if there is a way to get the 5 features together with the rainfall data in a SpatialPolygonsDataFrame through the intersect function.
My objective is to turn this SpatialPolygonsDataFrame, with the rainfall data for each Voronoi polygon, into a raster via the rasterize function later on, to compare with other interpolation results and satellite data.
Look at these results. The first is the SpatialPolygonsDataFrame (SPDF) I want, but it is missing the 5th feature. The second has all the features I want, but is missing the rainfall data.
spplot(u_WGS_2, 'JAN')
plot(u_WGS_1)
2. What I've tried
I looked into the ws_polygon_2 shape, searching for any other unwanted polygon that could pollute the shape and lead to these results. The shapefile is composed of only one polygon feature, the correct watershed feature.
I tried to use the aggregate function, as above and as shown in this tutorial, but I got the same result.
I tried to create a SPDF with the u_WGS_1 object and the d_prec SpatialPointsDataFrame. Actually, I'm still working on it; a rough sketch of my idea is below. If this is the correct way to solve my problem, please help me with the code.
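An untested sketch of that idea (it assumes gIntersection() with byid=TRUE builds composite IDs whose first token is the ID of the matching v_WGS polygon, which is worth verifying on the actual output; the object name u_WGS_1_df is just illustrative):
ids <- sapply(u_WGS_1@polygons, slot, "ID")         # composite IDs, e.g. "1 1", "2 1", ...
voronoi_ids <- sapply(strsplit(ids, " "), `[`, 1)   # the part that refers to v_WGS
u_WGS_1_df <- SpatialPolygonsDataFrame(
  u_WGS_1,
  data = as.data.frame(v_WGS)[match(voronoi_ids, row.names(v_WGS)), , drop = FALSE],
  match.ID = FALSE
)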
Thank you!

This is not an issue when using st_intersection() from sf, which retains the data from both data sets. Mind that dismo::voronoi() is compatible with sp objects only, so the precipitation data needs to be available in that format, at least temporarily. If you do not feel comfortable with sf and prefer to continue working with Spatial* objects after the actual intersection, simply invoke the as() method upon the output sf object as shown below.
library(sf)
#[1] Montagem da tabela de coordenadas dos postos pluviométricos
dados_precipitacao_1985 <- readxl::read_excel(path="data/DATA.xlsx")
dados_precipitacao_1985 <- st_as_sf(dados_precipitacao_1985, coords = c("x", "y"), crs = 4326)
dados_precipitacao_1985_sp <- as(dados_precipitacao_1985, "Spatial")
#[2] Coleta dos dados espaciais da bacia hidrográfica
bacia_Caio_Prado <- st_read(dsn="data/SHAPE_CORRIGIDO", layer="ws_polygon_2")
#[3] Poligonos de Thiessen - 1 INTERPOLAÇÃO
limits_voronoi_WGS <- c(-40.00,-38.90,-5.00,-4.50)
v_WGS <- dismo::voronoi(dados_precipitacao_1985_sp, ext=limits_voronoi_WGS)
v_WGS_sf <- st_as_sf(v_WGS)
u_WGS_3 <- st_intersection(bacia_Caio_Prado, v_WGS_sf)
plot(u_WGS_3[, 6], key.pos = 1)
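If Spatial* objects are needed downstream (e.g. for rasterize()), the result can be converted back with as(), as mentioned above (the object name u_WGS_3_sp is just illustrative):
u_WGS_3_sp <- as(u_WGS_3, "Spatial")   # back to a SpatialPolygonsDataFrame
class(u_WGS_3_sp)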

The missing polygon is removed because it is invalid
library(raster)
bacia <- shapefile("SHAPE_CORRIGIDO/ws_polygon_2.shp")
rgeos::gIsValid(bacia)
#[1] FALSE
#Warning message:
#In RGEOSUnaryPredFunc(spgeom, byid, "rgeos_isvalid") :
# Ring Self-intersection at or near point -39.070555560000003 -4.8419444399999998
The self-intersection is here:
zoom(bacia, ext=extent(-39.07828, -39.06074, -4.85128, -4.83396))
points(cbind( -39.070555560000003, -4.8419444399999998))
Invalid polygons are removed as they are assumed to have been produced by intersect. In this case, the invalid data was already there and should have been retained. I will see if I can fix that.
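Until that is fixed, a common workaround is to repair the geometry before intersecting, for example with a zero-width buffer (a sketch, assuming this repair is acceptable for the watershed shape; check the result visually):
bacia_fixed <- rgeos::gBuffer(bacia, byid = TRUE, width = 0)   # often repairs ring self-intersections
rgeos::gIsValid(bacia_fixed)
# with sf, the equivalent would be sf::st_make_valid(sf::st_as_sf(bacia))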

Related

Convert a column value(s) in SpatialpolygonDataframe into raster image

I need help with converting a variable or column values in a spatial polygon into a raster image. I have spatial data of administrative units with income (mean) information for each unit. I want to convert this information into a raster for further analysis.
I tried the code below but it didn't work.
r <- raster(ncol=5,nrow=15)
r.inc <- rasterize(DK, r, field=DK@data[,2], fun=mean)
Where DK is the spatial polygon and the mean income for each spatial unit is stored in column 2 of the SpatialPolygonsDataFrame. Can anyone help with a function or code showing how to rasterize the values in the column of interest? An example of the SpatialPolygonsDataFrame (created below) and my attempt to rasterize the data follow.
suppressPackageStartupMessages(library(tidyverse))
url = "https://api.dataforsyningen.dk/landsdele?format=geojson"
geofile = tempfile()
download.file(url, geofile)
DK <- rgdal::readOGR(geofile)
DK@data = subset(DK@data, select = c(navn))
DK@data$inc = runif(11, min=5000, max=80000)
require(raster)
r <- raster(ncol=5,nrow=15)
r.inc <- rasterize(DK, r, field=DK@data[,2], fun=mean)
plot(r.inc)
Thank you.
Acknowledgement: The code for creating the sample SPDF was sourced from Mikkel Freltoft Krogsholm (link below).
https://www.linkedin.com/pulse/easy-maps-denmark-r-mikkel-freltoft-krogsholm/?trk=read_related_article-card_title
Here's something that makes a raster.
library(tidyverse)
library(rgdal)
library(raster)
url <- "https://api.dataforsyningen.dk/landsdele?format=geojson"
geofile <- tempfile()
download.file(url, geofile)
DK <- rgdal::readOGR(geofile)
r_dk <- raster(DK, nrows = 100, ncols = 100) # Make a raster of the same size as the spatial polygon with many cells
DK$inc <- runif(nrow(DK), min=5000, max=80000) # Add some fake income data
rr <- rasterize(DK, r_dk, field='inc') # Rasterize the polygon into the raster - fun = 'mean' won't make any difference
plot(rr)
The original raster was the size of the whole Earth, so I think Denmark was being averaged to nothing. I resolved this by making an empty raster based on the extent of the DK spatial polygons with 100x100 cells. I also simplified the code. Generally, if you find yourself using @ with spatial data manipulation, it's a sign that there might be a simpler way. Because the raster cells are much smaller than each DK region, taking the average doesn't make much of a difference.
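To see why the original attempt came out essentially empty, compare the extents (an illustrative check, not part of the original post):
extent(raster(ncol = 5, nrow = 15))           # default raster covers the whole globe (-180..180, -90..90)
extent(raster(DK, nrows = 100, ncols = 100))  # matches Denmark's bounding box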

Failing to assign projection to sp object SpatialPointsDataFrame

I have a SpatialPointsDataFrame called johnny, created from a vanilla data frame by assigning coordinates. These coordinates are in coordinate system EPSG 4326 (the standard GPS geographic coordinate system), but johnny does not know that. So I am trying to assign EPSG 4326 to johnny, essentially as in this earlier question: data projection in R using package SP. I, too, am using sp. My ultimate goal is to project johnny to projected_johnny. However, I can't seem to assign the existing projection correctly first. Who can spot my mistake?
library(sp)
x <- seq(80,90,by=1)
y <- seq(40,50,by=1)
value <- seq(10,20,by=1)
johnny <- data.frame(cbind(x,y,value))
coordinates(johnny) <- ~x+y
class(johnny)
[1] "SpatialPointsDataFrame"
attr(,"package")
[1] "sp"
proj4string(johnny) <- CRS("+init=epsg:4326")
Error in if (is.na(get("has_proj_def.dat", envir = .RGDAL_CACHE))) { :
argument is of length zero
I have considered and rejected the following possible solutions after trying them out:
Adding library(rgdal) directly
using CRS("+proj=longlat +datum=WGS84") instead of CRS("+init=epsg:4326")
I am using R 3.6.0 and sp 1.3-1. The rgdal version loaded via sp is 1.5-15. Any ideas are welcome. This should be such a simple action...
I looked over your code and guessed what you are probably trying to accomplish, but the way you are going about it is more complicated than it needs to be. By far the easiest way to accomplish this is with the tools found in the sf package. Note that sf is a newer package than sp, and it provides easy-to-use tools for these tasks.
The code below is somewhat different from yours: a two-column matrix is used instead of your three-column data frame.
The simple feature geometry points are created from the matrix, then the simple feature column object is created from the geometry points, and finally the plot is drawn.
Code:
library(sf)
# Create matrix
x <- seq(80,90,by=1)
y <- seq(40,50,by=1)
# value <- seq(10,20,by=1)
#johnny <- data.frame(cbind(x,y))
jm <- matrix(data = c(x,y), nrow = 11, ncol = 2)
# coordinates(johnny) <- ~x+y
# class(johnny)
# johnny
Create sf multipoint geometry:
jm.sfg <- st_multipoint(jm)
jm.sfg
Create sf column object:
jm.sfc <- st_sfc(jm.sfg, crs = 4326)
jm.sfc
Plot
plot(jm.sfc, axes = TRUE)
The plot can be viewed from link below.
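Since the ultimate goal was a projected projected_johnny, the sfc object can then be projected with st_transform(); the target CRS below (UTM zone 45N, EPSG:32645) is only an illustrative assumption, so pick whatever projection suits your data:
projected_johnny <- st_transform(jm.sfc, crs = 32645)  # illustrative target CRS
plot(projected_johnny, axes = TRUE)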

Create topographic map in R

I am trying to create a script that will generate a 2d topographic or contour map for a given set of coordinates. My goal is something similar to what is produced by
contour(volcano)
but for any location set by the user. This has proved surprisingly challenging! I have tried:
library(elevatr)
library(tidyr)
# Generate a data frame of lat/long coordinates.
ex.df <- data.frame(x=seq(from=-73, to=-71, length.out=10),
y=seq(from=41, to=45, length.out=10))
# Specify projection.
prj_dd <- "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs"
# Use elevatr package to get elevation data for each point.
df.sp <- get_elev_point(ex.df, prj = prj_dd, src = "epqs")
# Convert from spatial to regular data frame, remove extra column.
# Use tidyr to convert to lat x lon table with elevation as fill.
# Sorry for the terrible code, I know this is sloppy.
df <- as.data.frame(df.sp)
df$elev_units <- NULL
df.w <- df %>% spread(y, elevation)
df.w <- as.matrix(df.w)
This creates a matrix similar to the volcano dataset but filled with NAs except for the 10 lat/lon pairs with elevation data. contour can handle NAs, but the result of contour(df.w) has only a single tiny line on it. I'm not sure where to go from here. Do I simply need more points? Thanks in advance for any help--I'm pretty new to R and I think I've bitten off more than I can chew with this project.
Sorry for the delay in responding. I suppose I need to check SO for elevatr questions!
I would use elevatr::get_elev_raster(), which returns a raster object which can be plotted directly with raster::contour().
The code example below grabs a smaller area at a fairly coarse resolution. The resulting contour looks decent though.
library(elevatr)
library(raster)
# Generate a data frame of lat/long coordinates.
ex.df <- data.frame(x=seq(from=-73, to=-72.5, length.out=10),
y=seq(from=41, to=41.5, length.out=10))
# Specify projection.
prj_dd <- "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs"
# Use elevatr package to get elevation data for each point.
elev <- get_elev_raster(ex.df, prj = prj_dd, z = 10, clip = "bbox")
raster::contour(elev)
If it is a requirement to use graphics::contour(), you'll need to convert the raster object to a matrix first with raster::as.matrix(elev). That flips the coords though, and I haven't spent enough time to get that part figured out... Hopefully the raster solution works for you.
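For completeness, one way the flip might be handled (an untested sketch; the transpose-and-reverse pattern below is a common recipe, so double-check the orientation against the raster plot):
m <- raster::as.matrix(elev)   # rows run north to south
m <- t(m)[, nrow(m):1]         # transpose and flip so y increases upward
graphics::contour(m)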

Label a point depending on which polygon contains it (NYC civic geospatial data)

I have the longitude and latitude of 5449 trees in NYC, as well as a shapefile for 55 different Neighborhood Tabulation Areas (NTAs). Each NTA has a unique NTACode in the shapefile, and I need to append a third column to the long/lat table telling me which NTA (if any) each tree falls under.
I've made some progress already using other point-in-polygon threads on stackoverflow, especially this one that looks at multiple polygons, but I'm still getting errors when trying to use gContains and don't know how I could check/label each tree for different polygons (I'm guessing some sort of sapply or for loop?).
Below is my code. Data/shapefiles can be found here: http://bit.ly/1BMJubM
library(rgdal)
library(rgeos)
library(ggplot2)
#import data
setwd("< path here >")
xy <- read.csv("lonlat.csv")
#import shapefile
map <- readOGR(dsn="CPI_Zones-NTA", layer="CPI_Zones-NTA", p4s="+init=epsg:25832")
map <- spTransform(map, CRS("+proj=longlat +datum=WGS84"))
#generate the polygons, though this doesn't seem to be generating all of the NTAs
nPolys <- sapply(map@polygons, function(x) length(x@Polygons))
region <- map[which(nPolys==max(nPolys)),]
plot(region, col="lightgreen")
#setting the region and points
region.df <- fortify(region)
points <- data.frame(long=xy$INTPTLON10,
lat =xy$INTPTLAT10,
id =c(1:5449),
stringsAsFactors=F)
#drawing the points / polygon overlay; currently only the points are appearing
ggplot(region.df, aes(x=long,y=lat,group=group))+
geom_polygon(fill="lightgreen")+
geom_path(colour="grey50")+
geom_point(data=points,aes(x=long,y=lat,group=NULL, color=id), size=1)+
xlim(-74.25, -73.7)+
ylim(40.5, 40.92)+
coord_fixed()
#this should check whether each tree falls into **any** of the NTAs, but I need it to specifically return **which** NTA
sapply(1:5449,function(i)
list(id=points[i,]$id, gContains(region,SpatialPoints(points[i,1:2],proj4string=CRS(proj4string(region))))))
#this is something I tried earlier to see if writing a new column using the over() function could work, but I ended up with a column of NAs
pts = SpatialPoints(xy)
nyc <- readShapeSpatial("< path to shapefile here >")
xy$nrow=over(pts, SpatialPolygons(nyc@polygons), returnlist=TRUE)
The NTAs we're checking for are these ones (visualized in GIS): http://bit.ly/1A3jEcE
Try simply:
ShapeFile <- readShapeSpatial("Shapefile.shp")
points <- data.frame(long=xy$INTPTLON10,
lat =xy$INTPTLAT10,
stringsAsFactors=F)
dimnames(points)[[1]] <- seq(1, length(xy$INTPTLON10), 1)
points <- SpatialPoints(points)
df <- over(points, ShapeFile)
I omitted the transformation of the shapefile because that is not the main subject here.
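To then label each tree with its NTA, the over() result can be bound back onto the tree table; a sketch assuming the shapefile carries the NTACode attribute described in the question and that both layers are in the same CRS:
xy$NTACode <- df$NTACode   # NA for trees that fall outside every NTA
head(xy)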

Plot spatial area defined by multiple polygons

I have a SpatialPolygonsDataFrame with 11589 spatial objects of class "polygons". 10699 of those objects consist of exactly 1 polygon. However, the rest of those spatial objects consist of multiple polygons (2 to 22).
If an object consists of multiple polygons, three scenarios are possible:
1) The additional polygons could describe a "hole" in the spatial area described by the first polygon.
2) The additional polygons could also describe additional geographic areas, i.e. the shape of the region is quite complex and described by putting together multiple parts.
3) Often it is a mix of both, 1) and 2).
My question is: How to plot such a spatial object which is based on multiple polygons?
I have been able to extract and plot the information of the first polygon, but I have not figured out how to plot all polygons of such a complex spatial object at once.
Below you will find my code. The problem is in the 15th-to-last line.
# Load packages
# ---------------------------------------------------------------------------
library(maptools)
library(rgdal)
library(ggmap)
library(rgeos)
# Get data
# ---------------------------------------------------------------------------
# Download shape information from the internet
URL <- "http://www.geodatenzentrum.de/auftrag1/archiv/vektor/vg250_ebenen/2012/vg250_2012-01-01.utm32s.shape.ebenen.zip"
td <- tempdir()
setwd(td)
temp <- tempfile(fileext = ".zip")
download.file(URL, temp)
unzip(temp)
# Get shape file
shp <- file.path(tempdir(),"vg250_0101.utm32s.shape.ebenen/vg250_ebenen/vg250_gem.shp")
# Read in shape file
x <- readShapeSpatial(shp, proj4string = CRS("+init=epsg:25832"))
# Transform the geocoding from UTM to Longitude/Latitude
x <- spTransform(x, CRS("+proj=longlat +datum=WGS84"))
# Extract relevant information
att <- attributes(x)
Poly <- att$polygons
# Pick an geographic area which consists of multiple polygons
# ---------------------------------------------------------------------------
# Output a frequency table of areas with N polygons
order.of.polygons.in.shp <- sapply(x@polygons, function(x) x@plotOrder)
length.vector <- unlist(lapply(order.of.polygons.in.shp, length))
table(length.vector)
# Get geographic area with the most polygons
polygon.with.max.polygons <- which(length.vector==max(length.vector))
# Check polygon
# x@polygons[polygon.with.max.polygons]
# Get shape for the geographic area with the most polygons
### HERE IS THE PROBLEM ###
### ONLY information for the first polygon is extracted ###
Poly.coords <- data.frame(slot(Poly[[polygon.with.max.polygons]]@Polygons[[1]], "coords"))
# Plot
# ---------------------------------------------------------------------------
## Calculate centroid for the first polygon of the specified area
coordinates(Poly.coords) <- ~X1+X2
proj4string(Poly.coords) <- CRS("+proj=longlat +datum=WGS84")
center <- gCentroid(Poly.coords)
# Download a map which is centered around this centroid
al1 = get_map(location = c(lon=center@coords[1], lat=center@coords[2]), zoom = 10, maptype = 'roadmap')
# Plot map
ggmap(al1) +
geom_path(data=as.data.frame(Poly.coords), aes(x=X1, y=X2))
I may be misinterpreting your question, but it's possible that you are making this much harder than necessary.
(Note: I had trouble dealing with the .zip file in R, so I just downloaded and unzipped it in the OS).
library(rgdal)
library(ggplot2)
setwd("< directory with shapefiles >")
map <- readOGR(dsn=".", layer="vg250_gem", p4s="+init=epsg:25832")
map <- spTransform(map, CRS("+proj=longlat +datum=WGS84"))
nPolys <- sapply(map@polygons, function(x) length(x@Polygons))
region <- map[which(nPolys==max(nPolys)),]
plot(region, col="lightgreen")
# using ggplot...
region.df <- fortify(region)
ggplot(region.df, aes(x=long,y=lat,group=group))+
geom_polygon(fill="lightgreen")+
geom_path(colour="grey50")+
coord_fixed()
Note that ggplot does not deal with the holes properly: geom_path(...) works fine, but geom_polygon(...) fills the holes. I've had this problem before (see this question), and based on the lack of response it may be a bug in ggplot. Since you are not using geom_polygon(...), this does not affect you...
In your code above, you would replace the line:
ggmap(al1) + geom_path(data=as.data.frame(Poly.coords), aes(x=X1, y=X2))
with:
ggmap(al1) + geom_path(data=region.df, aes(x=long,y=lat,group=group))
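As an aside (not part of the original answer): if you do need the filled version with the holes rendered correctly, one option is to convert the region to sf and use ggplot2::geom_sf(), which handles holes properly:
library(sf)
region_sf <- st_as_sf(region)
ggplot(region_sf) + geom_sf(fill = "lightgreen", colour = "grey50")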
