How to clip a polygon shapefile by another polygon shapefile in R? - r

I have two polygon shapefiles and I'd like to clip one by the other. I searched on Google, but I could only find examples of clipping by a bounding box or clipping points by polygons, and that's not what I need.
I also found implementations in other programming languages, but not in R (http://rosettacode.org/wiki/Sutherland-Hodgman_polygon_clipping#Python).
Could you help me?
Thanks
Tiago

Although the question is very old, a good answer might help anyone landing on this page in the future.
I think what you're trying to do is straightforward. To illustrate, let's assume I'm interested in the eastern coastline of Saudi Arabia (SA), and I have a shapefile that contains both the east and west coasts of SA and another shapefile of the Gulf (the prominent water body on the east coast of SA). We need the sf package to crop one shapefile by the other.
Load the sf package:
library(sf)
Then load both shapefiles:
ksa <- st_read("cropping_shape_file/ksa/saudi_arabia_coastline.shp")
gulf <- st_read("cropping_shape_file/gulf/ihoPolygon.shp")
# st_transform() ensures that both layers share the same CRS; otherwise the crop will fail.
gulf <- st_transform(gulf, st_crs(ksa))
You can also check that their CRS is the same:
st_crs(ksa)==st_crs(gulf)
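st_read() returns the shapefile's attribute fields plus a geometry column, but we're interested in just the geometry. Note that st_crop() cuts to the bounding box (rectangle) of its second argument; to clip to the polygon outline itself, call st_intersection() with the same two arguments.
You can crop like this: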
cropped<-st_crop( ksa$geometry, gulf$geometry)
plot(cropped)
You should get a plot of the cropped coastline.
I'm providing the two shapefiles here: https://mega.nz/file/NnxHXajI#KOK_86gwVQGfjNs7X2HOAtvG2NOkIwcU5EhzQU5X6tg. Note that a shapefile has four components that need to be kept together (.shp, .prj, .shx, .dbf).
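To see the difference between cropping to a bounding box and clipping to the polygon outline, here is a minimal sketch. It assumes the sf package is installed; the square and triangle are made-up geometries standing in for the two shapefiles, with no CRS set.

```r
library(sf)

# Hypothetical stand-ins for the two layers: a 4 x 4 square to be clipped
# and a triangle to clip it by (no CRS set, so planar coordinates)
square <- st_sfc(st_polygon(list(rbind(c(0, 0), c(4, 0), c(4, 4), c(0, 4), c(0, 0)))))
triangle <- st_sfc(st_polygon(list(rbind(c(1, 1), c(3, 1), c(2, 3), c(1, 1)))))

# st_crop() keeps everything inside the triangle's bounding box (a 2 x 2 rectangle)
cropped <- st_crop(square, triangle)

# st_intersection() keeps only what lies inside the triangle itself
clipped <- st_intersection(square, triangle)

area_cropped <- as.numeric(st_area(cropped))   # area of the bounding-box rectangle
area_clipped <- as.numeric(st_area(clipped))   # area of the triangle
```

With real data the same two calls would take ksa$geometry and gulf$geometry, once both are in the same CRS.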

Another way to clip is to make a selection like "polygontoclip"["templatepolygon", ]. I found this for points (http://robinlovelace.net/r/2014/07/29/clipping-with-r.html), but it also works with polygons.

This doesn't answer your question, but since you mention clipping by a bounding box, and this post comes up in searches for that:
From r-bloggers: Clipping by a bounding box
setwd("../")
library(rgdal)
zones <- readOGR("data", "london_sport")
Make a bounding box:
b <- bbox(zones)
b[1, ] <- (b[1, ] - mean(b[1, ])) * 0.5 + mean(b[1, ])
b[2, ] <- (b[2, ] - mean(b[2, ])) * 0.5 + mean(b[2, ])
b <- bbox(t(b))
plot(zones, xlim = b[1, ], ylim = b[2, ])
Using a custom function:
library(raster)
library(rgeos)
## rgeos version: 0.3-5, (SVN revision 447)
## GEOS runtime version: 3.4.2-CAPI-1.8.2 r3921
## Polygon checking: TRUE
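gClip <- function(shp, bb){
  # Accept either a bbox matrix or any object that raster::extent() understands.
  # is.matrix() is used because class() of a matrix has length 2 in R >= 4.0.
  if (is.matrix(bb)) b_poly <- as(extent(as.vector(t(bb))), "SpatialPolygons")
  else b_poly <- as(extent(bb), "SpatialPolygons")
  gIntersection(shp, b_poly, byid = TRUE)
}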
zones_clipped <- gClip(zones, b)
## Warning: spgeom1 and spgeom2 have different proj4 strings
plot(zones_clipped)
Note that due to the if statements in gClip’s body, it can handle almost any spatial data input, and still work.
westminster <- zones[grep("West", zones$name),]
zones_clipped_w <- gClip(zones, westminster)
## Warning: spgeom1 and spgeom2 have different proj4 strings
plot(zones_clipped_w); plot(westminster, col = "red", add = T)

Clip a shapefile by another shapefile
library(raster)
# library(rgdal) # if required
# library(rgeos) # if required
luxembourg <- shapefile(system.file("external/lux.shp", package="raster")) ### load a shape from raster package
plot(luxembourg)
another_shape <- as(extent(6, 6.4, 49.75, 50), 'SpatialPolygons') ### draw a simple shape / load your own
plot(another_shape, add=T, border="red", lwd=3)
luxembourg_clip <- crop(luxembourg, another_shape) ### crop (SpatialPolygon) luxembourg with another_shape
plot(luxembourg_clip, add=T, col=4) ### plot on
plot(luxembourg_clip, add=F, col=3) ### just plot
Source: https://rdrr.io/cran/raster/man/crop.html

Related

R raster::crop() The upper boundary of my cropped raster is always horizontal- why?

I'm trying to crop a large multipolygon shapefile by a single, smaller polygon. It works using st_intersection, however this takes a very long time, so I'm instead trying to convert the multipolygon to a raster, and crop that raster by the smaller polygon.
## packages - sorry if I've missed any!
library(raster)
library(rgdal)
library(fasterize)
library(sf)
## load files
shp1 <- st_read("pathtoshp", crs = 27700) # a large multipolygon shapefile to crop
### image below created using ggplot- ignore the black boundaries!
shp2 <- st_read("pathtoshp", crs = 27700) # a single, smaller polygon shapefile, to crop shp1 by
plot(shp2)
## convert to raster (faster than st_intersection)
projection1 <- CRS('+init=EPSG:27700')
rst_template <- raster(ncols = 1000, nrows = 1000,
crs = projection1,
ext = extent(shp1))
rst_shp1 <- fasterize(shp1, rst_template)
plot(rst_shp1)
rst_shp2 <- crop(rst_shp1, shp2)
plot(rst_shp2)
When I plot rst_shp2, the upper boundary is flat, rather than following the true boundary of the shp2 polygon.
Any help would be greatly appreciated!
Maybe try raster::mask() instead of crop(). crop() uses the second argument as an extent with which to crop a raster; i.e. it's taking the bounding box (extent) of your second argument and cropping that entire rectangle from your raster.
Something important to understand about raster objects is that they are all rectangular. The white space you see surrounding your shape are just NA values.
raster::mask() takes your original raster and a spatial object (raster, sf, etc.) and replaces all values in your raster which don't overlap with the spatial object with NA (the default; you can supply other replacement values). Though I will say, mask() will likely also take a while to run, so you may be better off just sticking with sf objects.
I would suggest moving to the "terra" package (faster and easier to use than "raster").
Here is an example.
library(terra)
r <- rast(system.file("ex/elev.tif", package="terra"))
v <- vect(system.file("ex/lux.shp", package="terra"))[4]
x <- crop(r, v)
plot(x); lines(v)
As edixon1 points out, a raster is always rectangular. If you want to set cells outside of the polygon to NA, you can do
x <- crop(r, v, mask=TRUE)
plot(x); lines(v)
In this example it makes no sense, but you could first rasterize
x <- crop(r, v)
y <- rasterize(v, x)
m <- mask(x, y)
plot(m); lines(v)
I am not sure if this answers your question. But if it does not, then please edit your question to make it reproducible, for example using the example data above.

Can I clip an area around a shapefile in R by specifying the coordinates?

I have a map of landcover data in R and want to clip a circle of specific area, say 20km, and extract the resultant circular shapefile.
# read in the shape file, assign the CRS and plot it
area <- readShapePoly("Corrine Land Use ITM Projection - Copy.shp", proj4string = CRS("+init=epsg:2157"))
plot(area, xlim = c(560000,600000), ylim = c(530000,580000), axes=TRUE)
# create a dataframe of the location where the buffer should be made and plot it
locations<-data.frame(latitude=584503.3,longitude = 560164.5)
points(locations, bg='tomato2', pch=21, cex=3)
Do I need to change my points into a coordinate system first before I do this?
The shape file is the Corine Landcover 2012 - National http://gis.epa.ie/GetData/Download
Thanks
Your polygons
area <- shapefile("Corrine Land Use ITM Projection - Copy.shp")
You can create a circle (or multiple circles) like this:
library(dismo)
p <- polygons(circles(cbind(0,0), sqrt(20000 / pi), lonlat=FALSE, dissolve=FALSE))
crs(p) <- crs(area)
Intersect
int <- crop(area, p)
Write
shapefile(int, 'landcover_circle.shp')
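A note on units for the radius above: sqrt(A / pi) is the radius of a circle of area A, in the units of the CRS. Assuming "20km" in the question means an area of 20 square kilometres, and that the CRS (EPSG:2157) uses metres, A would be 20e6 rather than 20000. Below is a small base-R check of that arithmetic. The helpers circle_ring() and ring_area() are hypothetical, and the centre is taken from the question's locations data frame purely for illustration.

```r
# Radius (in metres) of a circle whose area is 20 square kilometres
area_m2  <- 20 * 1000^2
radius_m <- sqrt(area_m2 / pi)   # roughly 2523 m

# Hypothetical helper: vertices of an n-sided approximation to the circle
circle_ring <- function(cx, cy, r, n = 360) {
  theta <- seq(0, 2 * pi, length.out = n + 1)
  cbind(x = cx + r * cos(theta), y = cy + r * sin(theta))
}

# Hypothetical helper: shoelace formula for the area of a closed ring
ring_area <- function(ring) {
  n <- nrow(ring)
  abs(sum(ring[-n, 1] * ring[-1, 2] - ring[-1, 1] * ring[-n, 2])) / 2
}

# Centre assumed from the question (x = 584503.3, y = 560164.5)
ring <- circle_ring(584503.3, 560164.5, radius_m)
area_km2 <- ring_area(ring) / 1e6   # close to 20
```

Under those assumptions, radius_m would replace sqrt(20000 / pi) in the circles() call, and the centre would replace cbind(0,0).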
@Manassa: I believe you will need to ensure the shapefile and raster are in the same projection before you clip; then you can use the crop function in the raster library. Please note that the output will be a clipped raster, not a shapefile as stated in your original question.
# Reproject shapefile to same projection as the raster
# shp = shapefile
# r = raster
library(rgdal)
shp.reproject <- spTransform(shp, crs(r))
#crop raster with polygon
library(raster)
r.crop <- crop(r, shp.reproject)

How to properly project and plot raster in R

I have a raster in an equal area Behrmann projection and I would like to project it to the Mollweide projection and plot.
When I do this with the following code, however, the plot doesn't look right: the map extends to the sides, and there are outlines of various landmasses where I wouldn't expect them. Also, the map extends beyond the plot window.
Can anyone please help me get this to plot nicely?
Thanks!
The data file used can be downloaded from this link.
Here is the code I have so far:
require(rgdal)
require(maptools)
require(raster)
data(wrld_simpl)
mollCRS <- CRS('+proj=moll')
behrmannCRS <- CRS('+proj=cea +lat_ts=30')
sst <- raster("~/Dropbox/Public/sst.tif", crs=behrmannCRS)
sst_moll <- projectRaster(sst, crs=mollCRS)
wrld <- spTransform(wrld_simpl, mollCRS)
plot(sst_moll)
plot(wrld, add=TRUE)
Alright, since the example at this page seems to work, I tried to mimic it as much as possible. I think problems arise because the far left and far right side of the raster image overlap. Cropping and an intermediate reprojection to Lat-Lon as in the example seem to solve your problem.
Perhaps this workaround can be a basis for a more elegant solution that directly addresses the problem, as it is not beneficial to reproject a raster twice.
# packages
library(rgdal)
library(maptools)
library(raster)
# define projections
mollCRS <- CRS('+proj=moll')
behrmannCRS <- CRS('+proj=cea +lat_ts=30')
# read data
data(wrld_simpl)
sst <- raster("~/Downloads/sst.tif", crs=behrmannCRS)
# crop sst to extent of world to avoid overlap on the seam
world_ext = projectExtent(wrld_simpl, crs = behrmannCRS)
sst_crop = crop(x = sst, y=world_ext, snap='in')
# convert sst to longlat (similar to test file)
# somehow this gets rid of the unwanted pixels outside the ellipse
sst_longlat = projectRaster(sst_crop, crs = ('+proj=longlat'))
# then convert to mollweide
sst_moll <- projectRaster(sst_longlat, crs=mollCRS, over=T)
wrld <- spTransform(wrld_simpl, mollCRS)
# plot results
plot(sst_moll)
plot(wrld, add=TRUE)

Plot spatial area defined by multiple polygons

I have a SpatialPolygonsDataFrame with 11589 spatial objects of class "polygons". 10699 of those objects consists of exactly 1 polygon. However, the rest of those spatial objects consist of multiple polygons (2 to 22).
If an object consists of multiple polygons, three scenarios are possible:
1) The additional polygons could describe a "hole" in the spatial area described by the first polygon.
2) The additional polygons could also describe additional geographic areas, i.e. the shape of the region is quite complex and described by putting together multiple parts.
3) Often it is a mix of both, 1) and 2).
My question is: How to plot such a spatial object which is based on multiple polygons?
I have been able to extract and plot the information of the first polygon, but I have not figured out how to plot all polygons of such a complex spatial object at once.
Below you find my code. The problem is the 15th last line.
# Load packages
# ---------------------------------------------------------------------------
library(maptools)
library(rgdal)
library(ggmap)
library(rgeos)
# Get data
# ---------------------------------------------------------------------------
# Download shape information from the internet
URL <- "http://www.geodatenzentrum.de/auftrag1/archiv/vektor/vg250_ebenen/2012/vg250_2012-01-01.utm32s.shape.ebenen.zip"
td <- tempdir()
setwd(td)
temp <- tempfile(fileext = ".zip")
download.file(URL, temp)
unzip(temp)
# Get shape file
shp <- file.path(tempdir(),"vg250_0101.utm32s.shape.ebenen/vg250_ebenen/vg250_gem.shp")
# Read in shape file
x <- readShapeSpatial(shp, proj4string = CRS("+init=epsg:25832"))
# Transform the geocoding from UTM to Longitude/Latitude
x <- spTransform(x, CRS("+proj=longlat +datum=WGS84"))
# Extract relevant information
att <- attributes(x)
Poly <- att$polygons
# Pick a geographic area which consists of multiple polygons
# ---------------------------------------------------------------------------
# Output a frequency table of areas with N polygons
order.of.polygons.in.shp <- sapply(x@polygons, function(x) x@plotOrder)
length.vector <- unlist(lapply(order.of.polygons.in.shp, length))
table(length.vector)
# Get geographic area with the most polygons
polygon.with.max.polygons <- which(length.vector==max(length.vector))
# Check polygon
# x@polygons[polygon.with.max.polygons]
# Get shape for the geographic area with the most polygons
### HERE IS THE PROBLEM ###
### ONLY information for the first polygon is extracted ###
Poly.coords <- data.frame(slot(Poly[[polygon.with.max.polygons ]]@Polygons[[1]], "coords"))
# Plot
# ---------------------------------------------------------------------------
## Calculate centroid for the first polygon of the specified area
coordinates(Poly.coords) <- ~X1+X2
proj4string(Poly.coords) <- CRS("+proj=longlat +datum=WGS84")
center <- gCentroid(Poly.coords)
# Download a map which is centered around this centroid
al1 = get_map(location = c(lon=center@coords[1], lat=center@coords[2]), zoom = 10, maptype = 'roadmap')
# Plot map
ggmap(al1) +
geom_path(data=as.data.frame(Poly.coords), aes(x=X1, y=X2))
I may be misinterpreting your question, but it's possible that you are making this much harder than necessary.
(Note: I had trouble dealing with the .zip file in R, so I just downloaded and unzipped it in the OS).
library(rgdal)
library(ggplot2)
setwd("< directory with shapefiles >")
map <- readOGR(dsn=".", layer="vg250_gem", p4s="+init=epsg:25832")
map <- spTransform(map, CRS("+proj=longlat +datum=WGS84"))
nPolys <- sapply(map@polygons, function(x)length(x@Polygons))
region <- map[which(nPolys==max(nPolys)),]
plot(region, col="lightgreen")
# using ggplot...
region.df <- fortify(region)
ggplot(region.df, aes(x=long,y=lat,group=group))+
geom_polygon(fill="lightgreen")+
geom_path(colour="grey50")+
coord_fixed()
Note that ggplot does not deal with the holes properly: geom_path(...) works fine, but geom_polygon(...) fills the holes. I've had this problem before (see this question), and based on the lack of response it may be a bug in ggplot. Since you are not using geom_polygon(...), this does not affect you...
In your code above, you would replace the line:
ggmap(al1) + geom_path(data=as.data.frame(Poly.coords), aes(x=X1, y=X2))
with:
ggmap(al1) + geom_path(data=region.df, aes(x=long,y=lat,group=group))

Check if point is in spatial object which consists of multiple polygons/holes

I have a SpatialPolygonsDataFrame with 11589 objects of class "polygons". 10699 of those objects consists of exactly 1 polygon, however the rest of those objects consists of multiple polygons (2 to 22).
If an object consists of multiple polygons, three scenarios are possible:
1) Sometimes, those additional polygons describe a "hole" in the geographic area described by the first polygon in the object of class "polygons".
2) Sometimes, those additional polygons describe additional geographic areas, i.e. the shape of the region is quite complex and described by putting together multiple parts.
3) Sometimes, it might be a mix of both 1) and 2).
Stack Overflow helped me to plot such a spatial object properly (Plot spatial area defined by multiple polygons).
However, I still cannot work out how to determine whether a point (defined by longitude/latitude) is in a polygon.
Below is my code. I tried to apply the function point.in.polygon in the sp package, but found no way for it to handle an object which consists of multiple polygons/holes.
# Load packages
# ---------------------------------------------------------------------------
library(maptools)
library(rgdal)
library(rgeos)
library(ggplot2)
library(sp)
# Get data
# ---------------------------------------------------------------------------
# Download shape information from the internet
URL <- "http://www.geodatenzentrum.de/auftrag1/archiv/vektor/vg250_ebenen/2012/vg250_2012-01-01.utm32s.shape.ebenen.zip"
td <- tempdir()
setwd(td)
temp <- tempfile(fileext = ".zip")
download.file(URL, temp)
unzip(temp)
# Get shape file
shp <- file.path(tempdir(),"vg250_0101.utm32s.shape.ebenen/vg250_ebenen/vg250_gem.shp")
# Read in shape file
map <- readShapeSpatial(shp, proj4string = CRS("+init=epsg:25832"))
# Transform the geocoding from UTM to Longitude/Latitude
map <- spTransform(map, CRS("+proj=longlat +datum=WGS84"))
# Pick a geographic area which consists of multiple polygons
# ---------------------------------------------------------------------------
# Output a frequency table of areas with N polygons
nPolys <- sapply(map@polygons, function(x)length(x@Polygons))
# Get geographic area with the most polygons
polygon.with.max.polygons <- which(nPolys==max(nPolys))
# Get shape for the geographic area with the most polygons
Poly.coords <- map[which(nPolys==max(nPolys)),]
# Plot
# ---------------------------------------------------------------------------
# Plot region without Google maps (ggplot2)
plot(Poly.coords, col="lightgreen")
# Find if a point is in a polygon
# ---------------------------------------------------------------------------
# Define points
points_of_interest <- data.frame(long=c(10.5,10.51,10.15,10.4),
lat =c(51.85,51.72,51.81,51.7),
id =c("A","B","C","D"), stringsAsFactors=F)
# Plot points
points(points_of_interest$long, points_of_interest$lat, pch=19)
You can do this simply with gContains(...) in the rgeos package.
gContains(sp1,sp2)
returns a logical depending on whether sp2 is contained within sp1. The only nuance is that sp2 has to be a SpatialPoints object, and it has to have the same projection as sp1. To do that, you would do something like this:
point <- data.frame(lon=10.2, lat=51.7)
sp2 <- SpatialPoints(point,proj4string=CRS(proj4string(sp1)))
gContains(sp1,sp2)
Here is a working example based on the answer to your previous question.
library(rgdal) # for readOGR(...)
library(rgeos) # for gContains(...)
library(ggplot2)
setwd("< directory with all your files >")
map <- readOGR(dsn=".", layer="vg250_gem", p4s="+init=epsg:25832")
map <- spTransform(map, CRS("+proj=longlat +datum=WGS84"))
nPolys <- sapply(map@polygons, function(x)length(x@Polygons))
region <- map[which(nPolys==max(nPolys)),]
region.df <- fortify(region)
points <- data.frame(long=c(10.5,10.51,10.15,10.4),
lat =c(51.85,51.72,51.81,51.7),
id =c("A","B","C","D"), stringsAsFactors=F)
ggplot(region.df, aes(x=long,y=lat,group=group))+
geom_polygon(fill="lightgreen")+
geom_path(colour="grey50")+
geom_point(data=points,aes(x=long,y=lat,group=NULL, color=id), size=4)+
coord_fixed()
Here, point A is in the main polygon, point B is in a lake (hole), point C is on an island, and point D is completely outside the region. So this code checks all of the points using gContains(...)
sapply(1:4,function(i)
list(id=points[i,]$id,
gContains(region,SpatialPoints(points[i,1:2],proj4string=CRS(proj4string(region))))))
# [,1] [,2] [,3] [,4]
# id "A" "B" "C" "D"
# TRUE FALSE TRUE FALSE
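rgeos and rgdal were retired from CRAN in 2023, so gContains() may not be installable on a current setup. Here is a minimal sketch of the same test with the sf package, assuming sf is installed. The region is a made-up multipolygon (a square with a hole, plus a separate island), not the vg250 data, and the point coordinates are chosen to reproduce the four cases A to D.

```r
library(sf)

# Made-up region: a square with a square hole ("lake") plus a separate island
outer  <- rbind(c(0, 0), c(10, 0), c(10, 10), c(0, 10), c(0, 0))
hole   <- rbind(c(4, 4), c(4, 6), c(6, 6), c(6, 4), c(4, 4))
island <- rbind(c(12, 0), c(14, 0), c(14, 2), c(12, 2), c(12, 0))
region <- st_sfc(st_multipolygon(list(list(outer, hole), list(island))),
                 crs = 25832)

# Points and region must share a CRS, so the same EPSG code is used for both
pts <- st_as_sf(data.frame(id = c("A", "B", "C", "D"),
                           x  = c(2, 5, 13, 20),
                           y  = c(2, 5, 1, 5)),
                coords = c("x", "y"), crs = 25832)

# One row per point, one column per region feature; holes are respected
inside <- st_intersects(pts, region, sparse = FALSE)[, 1]
result <- data.frame(id = pts$id, inside = inside)
```

A sits in the main polygon, B in the hole, C on the island, and D outside, mirroring the four points in the example above.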
Since you can use the "point in polygon" routine, and this apparently isn't already suitably designed to handle the multi-polygon case in R (which I find a bit odd actually), you are left with having to cycle through each of the multiple polygons. Now the trick is, if you are inside an odd number of polygons, you are inside the multi-polygon. If you are inside an even number of polygons, then you are actually outside of the shape.
Point in polygon testing that uses ray-crossings should ALREADY be able to handle this, just by making sure you pass in all the vertices to the original point.in.polygon test, but I am not sure which mechanism R is using, so I can only give you the even/odd advice above.
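The even/odd rule described above can be sketched in base R with a ray-crossing count summed over every ring, outer boundaries and holes alike. The function names and the test rings are hypothetical. This illustrates the rule only, not how sp or rgeos implement it, and points lying exactly on a boundary are not treated specially.

```r
# Count how many edges of one closed ring a ray from (px, py) towards +x crosses.
# `ring` is a two-column matrix whose last row repeats the first.
ring_crossings <- function(px, py, ring) {
  n  <- nrow(ring)
  x1 <- ring[-n, 1]; y1 <- ring[-n, 2]
  x2 <- ring[-1, 1]; y2 <- ring[-1, 2]
  straddles <- (y1 > py) != (y2 > py)                  # edge spans the ray's height
  x_cross   <- x1 + (py - y1) * (x2 - x1) / (y2 - y1)  # where the edge meets y = py
  sum(straddles & px < x_cross)
}

# A point is inside the shape if the total crossing count over all rings is odd
point_in_rings <- function(px, py, rings) {
  total <- sum(vapply(rings, function(r) ring_crossings(px, py, r), integer(1)))
  total %% 2 == 1
}

# Made-up shape: a square with a hole, plus a separate island
rings <- list(
  outer  = rbind(c(0, 0), c(10, 0), c(10, 10), c(0, 10), c(0, 0)),
  hole   = rbind(c(4, 4), c(4, 6), c(6, 6), c(6, 4), c(4, 4)),
  island = rbind(c(12, 0), c(14, 0), c(14, 2), c(12, 2), c(12, 0))
)

pts <- data.frame(id = c("A", "B", "C", "D"),
                  x  = c(2, 5, 13, 20),
                  y  = c(2, 5, 1, 5))
inside <- mapply(point_in_rings, pts$x, pts$y, MoreArgs = list(rings = rings))
```

Because holes simply add a second crossing, no ring needs to be flagged as a hole; only the parity of the total matters.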
I also found this code, not sure if it will help:
require(sp)
require(rgdal)
require(maps)
# read in bear data, and turn it into a SpatialPointsDataFrame
bears <- read.csv("bear-sightings.csv")
coordinates(bears) <- c("longitude", "latitude")
# read in National Parks polygons
parks <- readOGR(".", "10m_us_parks_area")
# tell R that bear coordinates are in the same lat/lon reference system
# as the parks data -- BUT ONLY BECAUSE WE KNOW THIS IS THE CASE!
proj4string(bears) <- proj4string(parks)
# combine is.na() with over() to do the containment test; note that we
# need to "demote" parks to a SpatialPolygons object first
inside.park <- !is.na(over(bears, as(parks, "SpatialPolygons")))
# what fraction of sightings were inside a park?
mean(inside.park)
## [1] 0.1720648
# use 'over' again, this time with parks as a SpatialPolygonsDataFrame
# object, to determine which park (if any) contains each sighting, and
# store the park name as an attribute of the bears data
bears$park <- over(bears, parks)$Unit_Name
# draw a map big enough to encompass all points (but don't actually plot
# the points yet), then add in park boundaries superimposed upon a map
# of the United States
plot(coordinates(bears), type="n")
map("world", region="usa", add=TRUE)
plot(parks, border="green", add=TRUE)
legend("topright", cex=0.85,
c("Bear in park", "Bear not in park", "Park boundary"),
pch=c(16, 1, NA), lty=c(NA, NA, 1),
col=c("red", "grey", "green"), bty="n")
title(expression(paste(italic("Ursus arctos"),
" sightings with respect to national parks")))
# now plot bear points with separate colors inside and outside of parks
points(bears[!inside.park, ], pch=1, col="gray")
points(bears[inside.park, ], pch=16, col="red")
# write the augmented bears dataset to CSV
write.csv(bears, "bears-by-park.csv", row.names=FALSE)
# ...or create a shapefile from the points
writeOGR(bears, ".", "bears-by-park", driver="ESRI Shapefile")

Resources