How to dissolve separated polygons into a large one? - r

I have a shape file that I can read like this in R:
library(rgdal)
shape <- readOGR(dsn = "~/path", layer = "a")
I am interested in the whole region that covers all polygons (black curve here). How can I dissolve all polygons, even those that are separated, into one polygon like this?
I am open to solutions in R or QGIS.

Using R & the sf package you can make a convex hull of the unioned (if necessary) shapefile. Since you haven't included data, I've used the nc data included with the sf package to illustrate the method.
library(dplyr)
library(sf)
library(ggplot2)
# setting up sample data,
# you'll need to use st_read() to read your shapefile, not readOGR()
nc <- st_read(system.file("shape/nc.shp", package="sf"))
#> Reading layer `nc' from data source
#> `.../sf/shape/nc.shp'
#> using driver `ESRI Shapefile'
#> Simple feature collection with 100 features and 14 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
#> Geodetic CRS: NAD27
nc <- nc[c(1:30, 85:81),] #Use some non-contiguous counties
# make a convex hull of the unioned geometries
nc_hull <- st_convex_hull(st_union(nc))
ggplot() +
geom_sf(data = nc, fill = NA, color = 'red') +
geom_sf(data = nc_hull, fill = NA, color = 'black')
Created on 2022-03-18 by the reprex package (v2.0.1)
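A convex hull also covers the gaps between separated polygons, so it can be noticeably larger than the dissolved polygons themselves. The following is a minimal sketch of how one might check that, reusing the nc sample data from above. The projection EPSG:32119 (NAD83 / North Carolina, metres) and the names area_union, area_hull and extra_share are assumptions made for this illustration only.

```r
library(sf)

# Sample data shipped with sf. EPSG:32119 is an assumed projection, chosen
# here only so that areas are computed on planar coordinates in metres.
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
nc <- st_transform(nc[c(1:30, 81:85), ], 32119)

# st_union() dissolves shared borders but keeps separated parts separate;
# st_convex_hull() then wraps everything into one enclosing polygon.
nc_union <- st_union(nc)
nc_hull <- st_convex_hull(nc_union)

# Share of the hull area that is not covered by any original polygon
area_union <- as.numeric(st_area(nc_union))
area_hull <- as.numeric(st_area(nc_hull))
extra_share <- (area_hull - area_union) / area_hull
extra_share
```

If the extra area matters for your analysis, st_union() on its own gives the dissolved shape without the filled-in gaps, at the cost of possibly returning more than one part.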

Related

Create random points over a line spatvector in R

I have a spatVector composed of a single-line geometry that covers the entire road network of my study area.
I would like to create a set of N random points over this geometry. I know how to do it in QGIS but I want to do it in R since I have to iterate this process 1'000 times and I want to create a loop.
Do you know any function to do this?
EDIT
First of all, I read my line shapefile using:
Road_network <- vect("path/to/file.shp")
Then I converted it into an SF object:
Road_network_SF <- st_as_sf(Road_network)
And finally, I tried both st_sample, getting the following result:
Random_points <- st_sample(Road_network_SF, size = 1799)
Random_points
Geometry set for 46350 features (with 44694 geometries empty)
Geometry type: MULTIPOINT
Dimension: XY
Bounding box: xmin: 4503139 ymin: 2504751 xmax: 4622797 ymax: 2613276
Projected CRS: ETRS89-extended / LAEA Europe
First 5 geometries:
MULTIPOINT EMPTY
MULTIPOINT EMPTY
MULTIPOINT EMPTY
MULTIPOINT ((4503139 2574957))
MULTIPOINT EMPTY
and the st_line_sample function, getting the following error:
Random_points <- st_line_sample(Road_network_SF, n = 1799)
Error in st_line_sample(Road_network_SF, n = 1799) :
inherits(x, "sfc_LINESTRING") is not TRUE
When I converted the spatVector to an sf object, this is what I get:
Road_network_SF
Simple feature collection with 1 feature and 2 fields
Geometry type: MULTILINESTRING
Dimension: XY
Bounding box: xmin: 4500176 ymin: 2504157 xmax: 4626207 ymax: 2616041
Projected CRS: ETRS89-extended / LAEA Europe
FURTHER EDIT
The workflow proposed by @Gregory works really well; my error was due to a problem with the road shapefile. I changed it and no further problems occurred, thank you!
Thanks in advance!
You can sample random points along a vector geometry (like roads) with sf::st_sample(), however the results might seem confusing depending on how you look at them. Here's a reproducible example.
library(sf, quietly = TRUE)
#> Linking to GEOS 3.10.2, GDAL 3.4.2, PROJ 8.2.1; sf_use_s2() is TRUE
library(tigris, quietly = TRUE)
#> To enable
#> caching of data, set `options(tigris_use_cache = TRUE)` in your R script or .Rprofile.
library(ggplot2)
suppressMessages(
roads <- roads(state = "NC",
county = "Mecklenburg")
)
set.seed(1)
rpoints <- st_sample(roads, size = 5)
#> although coordinates are longitude/latitude, st_sample assumes that they are
#> planar
ggplot() +
geom_sf(data = roads, color = "grey") +
geom_sf(data = rpoints, color = "black")
We see on the map that we have generated 5 random points, as intended. Surprisingly, if you examine the structure of the rpoints object you'll see that it is a multipoint of length 21672, which you might think is the number of points. However, all but 5 of them have empty geometries. The reason is that there is one geometry (empty for most) for each of the features that make up the roads vector.
str(rpoints)
#> sfc_MULTIPOINT of length 21672; first list element: 'XY' num[0 , 1:2] MULTIPOINT EMPTY
head(rpoints)
#> Geometry set for 6 features (with 6 geometries empty)
#> Geometry type: MULTIPOINT
#> Dimension: XY
#> Bounding box: xmin: NA ymin: NA xmax: NA ymax: NA
#> Geodetic CRS: NAD83
#> First 5 geometries:
#> MULTIPOINT EMPTY
#> MULTIPOINT EMPTY
#> MULTIPOINT EMPTY
#> MULTIPOINT EMPTY
#> MULTIPOINT EMPTY
Here's how to get the real points out of there.
rpoints <- rpoints[!st_is_empty(rpoints)]
rpoints
#> Geometry set for 5 features
#> Geometry type: MULTIPOINT
#> Dimension: XY
#> Bounding box: xmin: -81.01691 ymin: 35.07471 xmax: -80.62246 ymax: 35.2948
#> Geodetic CRS: NAD83
#> MULTIPOINT ((-80.88764 35.2948))
#> MULTIPOINT ((-80.62246 35.18395))
#> MULTIPOINT ((-81.01691 35.07471))
#> MULTIPOINT ((-80.78909 35.12663))
#> MULTIPOINT ((-80.83055 35.16959))
Created on 2023-02-01 by the reprex package (v2.0.1)
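Since the question asks for a loop, the filtering step can be wrapped in a small helper. This is a minimal sketch under stated assumptions: the three toy lines stand in for a real road network, EPSG:3035 matches the CRS shown in the question, and the helper name sample_on_lines is hypothetical. It also casts the result to POINT, because one feature may receive several points in a single MULTIPOINT.

```r
library(sf)

# Toy "road network": three line features in a projected CRS
# (EPSG:3035, the ETRS89-extended / LAEA Europe CRS from the question).
roads <- st_sfc(
  st_linestring(rbind(c(0, 0), c(1000, 0))),
  st_linestring(rbind(c(0, 500), c(1000, 500))),
  st_linestring(rbind(c(500, 0), c(500, 500))),
  crs = 3035
)

# Hypothetical helper: sample n points, drop empty geometries, and split
# any MULTIPOINT that holds several points into individual POINTs.
sample_on_lines <- function(lines, n) {
  pts <- st_sample(lines, size = n)
  pts <- pts[!st_is_empty(pts)]
  st_cast(pts, "POINT")
}

set.seed(1)
# Three iterations here; the same pattern would apply to 1000.
samples <- lapply(1:3, function(i) sample_on_lines(roads, n = 5))
lengths(samples)
```

Each list element is then a plain set of points that can be plotted or analysed per iteration.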

Smoothing polygons on map with ggplot2 and sf

How can you smooth the polygons of a map produced with ggplot and sf?
I have used the sf package to extract the polygons from a shapefile
geomunicipios <- st_read("ruta/archivo.shp")
Reading layer `archivo' from data source
`ruta\archivo.shp'
using driver `ESRI Shapefile'
Simple feature collection with 45 features and 10 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -2.344411 ymin: 37.37375 xmax: -0.647983 ymax: 38.75509
Geodetic CRS: WGS 84
And ggplot2 to plot the map:
rmurcia <- ggplot(data = geomunicipios) +
geom_sf(aes(fill=columna),color="#FFFFFF",size=1)
To perform the smoothing of the polygons I have analyzed three alternatives:
i. package "smoothr":
geosmunicipios <- smooth(geomunicipios, method = "ksmooth", smoothness = 12)
ii. package "rmapshaper":
geosmunicipios <- ms_simplify(geomunicipios, keep = 0.02500, weighting = 12)
iii. package "sf":
geosmunicipios <- st_simplify(geomunicipios, dTolerance = 50, preserveTopology = TRUE)
You have to try different values of the parameters to adjust to the needs and obtain the desired result.
To reproduce the case, the download can be done from: centrodedescargas.cnig.es/CentroDescargas/index.jsp
And follow the links:
Información geográfica de referencia - Límites municipales, provinciales y autonómicos - Descargar: lineas_limite.zip.
And the path in the uncompressed folder:
SIGLIM_Publico_INSPIRE - SHP_ETRS89 - recintos_municipales_inspire_peninbal_etrs89 - recintos_municipales_inspire_peninbal_etrs89.shp
Finally, for this case I have chosen rmapshaper: it produces a satisfactory result and reduces the size of the .pdf file in which I include the graphic.
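One way to try different parameter values is to count how many vertices remain after each run. The sketch below does this for st_simplify only, and makes several assumptions: it uses the nc sample data from sf instead of the municipal boundaries, projects to EPSG:32119 so that dTolerance is in metres, and the tolerance values and the helper count_vertices are chosen purely for illustration.

```r
library(sf)

# Sample polygons shipped with sf, projected to an assumed planar CRS
# (EPSG:32119, metres) so that dTolerance is expressed in metres.
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
nc_m <- st_transform(nc, 32119)

# Hypothetical helper: total number of vertices in a polygon layer
count_vertices <- function(x) {
  nrow(st_coordinates(st_cast(st_geometry(x), "MULTIPOLYGON")))
}

# Illustrative tolerances; sensible values depend on the map scale
tolerances <- c(100, 1000, 5000)
vertices <- sapply(tolerances, function(tol) {
  simplified <- st_simplify(nc_m, dTolerance = tol, preserveTopology = TRUE)
  count_vertices(simplified)
})

data.frame(dTolerance = tolerances, vertices = vertices,
           original = count_vertices(nc_m))
```

Fewer vertices usually mean a smaller output file, so a table like this helps to pick the coarsest setting that still looks acceptable on the map.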

How to read & plot from .shp files using sf package in r?

I am new to geospatial data & trying to plot using .shp file but getting an error.
The geometry type in this .shp file is LINESTRING which seems to be different from MULTIPOLYGON which I have plotted before using sf
shape file source: https://github.com/johnsnow09/covid19-df_stack-code/blob/main/in_country_boundaries.shp
original source of shapefile: https://github.com/wri/wri-bounds/blob/master/dist/in_countries.zip
Expecting result:
code attempt:
library(tidyverse)
library(sf)
ind_global <- sf::read_sf("path/in_country_boundaries.shp")
ind_global
output
Simple feature collection with 412 features and 9 fields
Geometry type: LINESTRING
Dimension: XY
Bounding box: xmin: -141.0056 ymin: -54.88624 xmax: 140.9776 ymax: 70.07531
Geodetic CRS: WGS 84
ind_global %>%
st_as_sf() %>%
ggplot() +
geom_sf()
Error in st_cast.POINT(X[[i]], ...) : cannot create MULTILINESTRING
from POINT
Do I need to handle LINESTRING geometry .shp file in some other way?
The code is running fine after removing st_as_sf() %>%. I have downloaded the shapefile from https://github.com/wri/wri-bounds/blob/master/dist/in_countries.zip and it is MULTIPOLYGON only.
library(tidyverse)
library(sf)
ind_global <- sf::read_sf("...\\in_countries.shp")
ind_global
ind_global %>%
ggplot() +
geom_sf()

Creating kernel density estimate for polygon in R

I have a shapefile of polygons and another one of points that are distributed over the polygons. I would like to create a kernel density estimate for each polygon based on the points it contains. Unfortunately, I was only able to create square KDEs with the kde2d function from the MASS package. I would like the KDEs to be shaped like the polygons.
Any suggestions?
kde1 <- kde2d(poly$X, poly$Y, n = 100)
You can use the spatstat package for this. Here is an example of reading
in a shapefile with sf, generating random points and running kernel density
estimation of the intensity of points (points per unit area):
library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
nc <- st_read(system.file("shape/nc.shp", package="sf"))
#> Reading layer `nc' from data source `/usr/lib/R/site-library/sf/shape/nc.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 100 features and 14 fields
#> geometry type: MULTIPOLYGON
#> dimension: XY
#> bbox: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
#> geographic CRS: NAD27
library(spatstat) # load before as.owin(), which is a spatstat generic
nc_flat <- st_transform(nc, crs = 26917)
W <- as.owin(nc_flat$geometry[1]) # First county of the North Carolina data set, in spatstat format
X <- runifpoint(100, win = W)
plot(X, main = "Random points")
D <- density(X)
plot(D, main = "KDE")
OK! I managed to use my own points by using the 'ppp' function from the spatstat package.
C <- as.owin(polygon$geometry[n])
p<- ppp(points$X,points$Y, window = C)
D <- density(p)
Resulting plot: https://i.stack.imgur.com/YZN0V.png
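To get one KDE per polygon, the same steps can be repeated in a loop. The following is a rough sketch rather than the poster's actual code: it assumes the nc sample data in place of the real shapefiles, simulates the points with st_sample, and introduces the names polys, pts and kde_list for illustration.

```r
library(sf)
library(spatstat)

# Assumed stand-in data: three counties from the nc sample data, projected
# to planar coordinates (EPSG:26917) as spatstat requires.
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
polys <- st_geometry(st_transform(nc, 26917))[1:3]

# Simulated points spread over all three polygons
set.seed(1)
pts <- st_sample(st_union(polys), size = 300)

# One kernel density estimate per polygon, each clipped to its own window
kde_list <- lapply(seq_along(polys), function(i) {
  inside <- pts[st_intersects(pts, polys[i], sparse = FALSE)[, 1]]
  xy <- st_coordinates(inside)
  p <- ppp(xy[, 1], xy[, 2], window = as.owin(polys[i]))
  density(p)
})

plot(kde_list[[1]], main = "KDE for the first polygon")
```

Each element of kde_list is a pixel image that is only defined inside the corresponding polygon, which gives the polygon-shaped KDE the question asks for.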

create density raster and extract sum by polygon feature

I have a polygon (zones) and a set of coordinates (points). I'd like to create a spatial kernel density raster for the entire polygon and extract the sum of the density by zone. Points outside of the polygon should be discarded.
library(raster)
library(tidyverse)
library(sf)
library(spatstat)
library(maptools)
load(url("https://www.dropbox.com/s/iv1s5butsx2v01r/example.RData?dl=1"))
# alternatively, links to gists for each object
# https://gist.github.com/ericpgreen/d80665d22dfa1c05607e75b8d2163b84
# https://gist.github.com/ericpgreen/7f4d3cee3eb5efed5486f7f713306e96
ggplot() +
geom_sf(data = zones) +
geom_sf(data = points) +
theme_minimal()
I tried converting to ppp with {spatstat} and then using density(), but I'm confused by the units in the result. I believe the problem is related to the units of the map, but I'm not sure how to proceed.
Update
Here's the code to reproduce the density map I created:
zones_owin <- as.owin(as_Spatial(zones))
pts <- st_coordinates(points)
p <- ppp(pts[,1], pts[,2], window=zones_owin, unitname=c("metre","metres"))
ds <- density(p)
r <- raster(ds)
plot(r)
Units are difficult when you work directly with geographic coordinates (lon, lat). If possible you should convert to planar coordinates (which is a requirement for spatstat) and proceed from there. The planar coordinates would typically be in units of meters, but I guess it depends on the specific projection and underlying ellipsoid etc. You can see this answer for how to project to planar coordinates with sf and export to spatstat format using maptools. Note: You have to manually choose a sensible projection (you can use http://epsg.io to find one) and you have to project both the polygon and the points.
Once everything is in spatstat format you can use density.ppp to do kernel smoothing. The resulting grid values (object of class im) are intensities of points, i.e., number of points per square unit (e.g. square meter). If you want to aggregate over some region you can use integral.im(..., domain = ...) to get the expected number of points in this region for a point process model with the given intensity.
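As a small illustration of that last step, here is a sketch on synthetic data. The rectangular window, the bandwidth sigma = 50 and the sub-region left_half are all assumptions made for the example; with real data the window and the domain would come from the projected zone polygons.

```r
library(spatstat)

# Synthetic planar window of 1000 x 500 units, assumed to be metres
W <- owin(c(0, 1000), c(0, 500))
set.seed(1)
X <- runifpoint(200, win = W)

# Kernel-smoothed intensity: points per square unit (here per square metre)
D <- density(X, sigma = 50)

# Integrating the intensity over a sub-region gives the expected number
# of points there, which can be compared with the observed count.
left_half <- owin(c(0, 500), c(0, 500))
expected_left <- integral(D, domain = left_half)
observed_left <- npoints(X[left_half])

c(expected = expected_left, observed = observed_left)
```

The individual pixel values look tiny because they are per square metre; it is the integral over a region that is on the scale of point counts.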
I'm not sure whether this answers all of your question, but it should be a good start. Clarify in a comment or in your question if you need a different type of output.
The code below removes all points that are not inside one of the 'zone' polygons, counts them by zone and plots the zones colored by the number of points that fall within each.
library(raster)
library(tidyverse)
library(sf)
#> Linking to GEOS 3.6.2, GDAL 2.2.3, PROJ 4.9.3
library(spatstat)
library(maptools)
#> Checking rgeos availability: TRUE
load(url("https://www.dropbox.com/s/iv1s5butsx2v01r/example.RData?dl=1"))
# alternatively, links to gists for each object
# https://gist.github.com/ericpgreen/d80665d22dfa1c05607e75b8d2163b84
# https://gist.github.com/ericpgreen/7f4d3cee3eb5efed5486f7f713306e96
p1 <- ggplot() +
geom_sf(data = zones) +
geom_sf(data = points) +
theme_minimal()
#Remove points outside of zones
points_inside <- st_intersection(points, zones)
#> although coordinates are longitude/latitude, st_intersection assumes that they are planar
#> Warning: attribute variables are assumed to be spatially constant throughout all
#> geometries
nrow(points)
#> [1] 308
nrow(points_inside)
#> [1] 201
p2 <- ggplot() +
geom_sf(data = zones) +
geom_sf(data = points_inside)
points_per_zone <- st_join(zones, points_inside) %>%
count(LocationID.x)
#> although coordinates are longitude/latitude, st_intersects assumes that they are planar
p3 <- ggplot() +
geom_sf(data = points_per_zone,
aes(fill = n)) +
scale_fill_viridis_c(option = 'C')
points_per_zone
#> Simple feature collection with 4 features and 2 fields
#> geometry type: POLYGON
#> dimension: XY
#> bbox: xmin: 34.0401 ymin: -1.076718 xmax: 34.17818 ymax: -0.9755066
#> epsg (SRID): 4326
#> proj4string: +proj=longlat +ellps=WGS84 +no_defs
#> # A tibble: 4 x 3
#> LocationID.x n geometry
#> * <dbl> <int> <POLYGON [°]>
#> 1 10 129 ((34.08018 -0.9755066, 34.0803 -0.9757393, 34.08046 -0.975…
#> 2 20 19 ((34.05622 -0.9959458, 34.05642 -0.9960835, 34.05665 -0.99…
#> 3 30 29 ((34.12994 -1.026372, 34.12994 -1.026512, 34.12988 -1.0266…
#> 4 40 24 ((34.11962 -1.001829, 34.11956 -1.002018, 34.11966 -1.0020…
cowplot::plot_grid(p1, p2, p3, nrow = 2, ncol = 2)
It seems I underestimated the difficulty of your problem. Is something like the plot below (& underlying data) what you're looking for?
It uses a raster with a grid of roughly 50x50 cells and raster::focal with a 9x9 window, taking the mean to interpolate the data.
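The code behind that plot is not shown, so the following is only a sketch of how such a smoothing could look. It assumes synthetic points on a 50x50 grid, counts points per cell, and then applies raster::focal with a 9x9 mean window; the names counts and smoothed are introduced for the example.

```r
library(raster)

# Empty 50 x 50 grid over an assumed 50 x 50 unit extent
r <- raster(nrows = 50, ncols = 50, xmn = 0, xmx = 50, ymn = 0, ymx = 50)

# Synthetic point coordinates standing in for the real data
set.seed(1)
xy <- cbind(runif(200, 0, 50), runif(200, 0, 50))

# Number of points per cell; cells without points get 0
counts <- rasterize(xy, r, fun = function(x, ...) length(x), background = 0)

# 9 x 9 moving window mean; pad = TRUE with na.rm = TRUE keeps edge cells
smoothed <- focal(counts, w = matrix(1, 9, 9), fun = mean,
                  na.rm = TRUE, pad = TRUE)

plot(smoothed, main = "Point counts smoothed with a 9x9 mean window")
```

A moving mean like this is a simple box filter rather than a true kernel density estimate, so the values are average counts per cell, not intensities per unit area.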
