Adjust centroids of spatial polygons using sf - r

I have a shapefile of local government regions. I use st_read() to import it into R as an sf object. I want to compute the distance between the local government regions. st_centroid() gives me polygon centroids and I can compute distance using st_distance().
regions <- st_read("~/Downloads/regions.shp")
regions_with_centroids <- st_centroid(regions)
extract_centroids <- regions_with_centroids %>%
as_tibble() %>% # keep the geometry column; st_drop_geometry() would discard the centroids
select(region_name, centroid = geometry)
# create edge list
edge_list <- extract_centroids %>%
select(region_name) %>%
expand(from = region_name, to = region_name) %>%
filter(from < to) %>%
left_join(extract_centroids, by = c("from" = "region_name")) %>%
rename(from_centroid = centroid) %>%
left_join(extract_centroids, by = c("to" = "region_name")) %>%
rename(to_centroid = centroid) %>%
mutate(distance = st_distance(from_centroid, to_centroid, by_element = TRUE))
However, I really want to analyze the commuting distance between major urban areas in each government region. I need to shift the centroids to the population "centre of gravity".
I can use a shapefile of census enumerator areas to help me with this. The enumerator areas are sized by population. Using st_intersection() I can intersect the enumerator areas with the government regions. This gives me sub-regions within each government region. I can compute centroids for all the sub-regions. Grouping by region, I can compute the mean centroid for all the sub-regions in a region. The mean centroid = "centre of gravity", which gives a more realistic commute distance between regions.
regions <- st_read("~/Downloads/regions.shp")
ea <- st_read("~/Downloads/enumerator_areas.shp")
intersected <- st_intersection(regions, ea)
sub_region_centroids <- st_centroid(intersected)
Where I run into difficulty is how to find the mean centroid. Grouping by region is not working.
mean_centroid <- sub_region_centroids %>%
group_by(region_name) %>%
summarise(mean_centroid = mean(geometry))
Warning messages:
1: In mean.default(geometry) :
the argument is not numeric or logical: returning NA
Where am I going wrong?
I also do not know how to add the mean centroid back to the original region's object.
I hope someone can assist me.

Computing a population weighted average of multiple centroids is an interesting problem.
You can consider an approach like this, where I calculate the weighted centroid of three cities in North Carolina (to make use of the well known & much loved nc.shp file that ships with {sf}).
The workflow uses tidyr::uncount() to first replicate each city point once per inhabitant; the (many) replicated points are then combined into a single multipoint feature. Multipoint features have a defined sf::st_centroid() operation (QED). The final sf::st_as_sf() is just polish.
library(sf)
library(dplyr)
library(ggplot2)
# included with sf package
shape <- st_read(system.file("shape/nc.shp", package="sf"))
# dramatis personae; population as per Wikipedia
cities <- data.frame(name = c("Raleigh", "Greensboro", "Wilmington"),
x = c(-78.633333, -79.819444, -77.912222),
y = c(35.766667, 36.08, 34.223333),
population = c(467665, 299035, 115451)) %>%
st_as_sf(coords = c("x", "y"), crs = 4326)
# a quick overview of facts on ground
ggplot() +
geom_sf(data = shape) + # polygon of North Carolina
geom_sf(data = cities, color = "red") # 3 cities
# unweighted centroid / a baseline
plain_center <- cities %>%
st_geometry() %>% # pull just geometry
st_combine() %>% # from many points to a single multipoint
st_centroid() %>% # compute centroid of the multipoint
st_as_sf() # make it a sf object again
# the fun is here!!
center_of_centers <- cities %>%
tidyr::uncount(population) %>% # multiply rows according to population
st_geometry() %>% # pull just geometry
st_combine() %>% # from many points to a single multipoint
st_centroid() %>% # compute centroid of the multipoint
st_as_sf() # make it a sf object again
# finished result
ggplot() +
geom_sf(data = shape, color = "gray75") + # polygon of North Carolina
geom_sf(data = cities, color = "red") + # 3 cities
geom_sf(data = plain_center, color = "green") + # unweighted center
geom_sf(data = center_of_centers, color = "blue", pch = 4) # population weighted center
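Applied to the grouped case in the question, the same multipoint idea might look like the sketch below. The `sub_region_centroids` table here is made-up stand-in data (four points in two regions, in an arbitrary projected CRS), not the asker's shapefile. Note that this gives the plain mean of the sub-region centroids per region; it is only population weighted to the extent that the enumerator areas hold similar populations.

```r
library(sf)
library(dplyr)

# Hypothetical stand-in for the sub-region centroids from st_intersection()
sub_region_centroids <- data.frame(
  region_name = c("A", "A", "B", "B"),
  x = c(0, 2, 10, 14),
  y = c(0, 4, 10, 10)
) %>%
  st_as_sf(coords = c("x", "y"), crs = 3857)

# mean() has no method for geometry columns, so combine the points of each
# region into one MULTIPOINT and take its centroid instead
mean_centroid <- sub_region_centroids %>%
  group_by(region_name) %>%
  summarise(geometry = st_combine(geometry)) %>%
  st_centroid()

# Plain coordinate columns that can be joined back to the regions by name
centroid_xy <- mean_centroid %>%
  mutate(cx = st_coordinates(.)[, 1],
         cy = st_coordinates(.)[, 2]) %>%
  st_drop_geometry()
```

Something like `left_join(regions, centroid_xy, by = "region_name")` would then attach `cx` and `cy` to the original regions object.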

Following @Jindra Lacko's nice example, here is how it can be done taking the weighted mean of the lat and long.
library(sf)
library(dplyr)
library(ggplot2)
# weighted mean of lat and long
center_weighted <- cities %>%
mutate(lon = sf::st_coordinates(.)[,1],
lat = sf::st_coordinates(.)[,2]) %>%
st_drop_geometry() %>%
summarize(across(c(lon, lat), ~ weighted.mean(.x, w = population))) %>% # lambda form; passing extra arguments through across() is deprecated in recent dplyr
st_as_sf(coords = c("lon", "lat"), crs = 4326)
# plot it
ggplot() +
geom_sf(data = shape, color = "gray75") +
geom_sf(data = cities, color = "red") +
geom_sf(data = center_weighted, color = "blue", pch = 4)
Data
# set up example data
shape <- st_read(system.file("shape/nc.shp", package="sf"))
cities <- data.frame(name = c("Raleigh", "Greensboro", "Wilmington"),
x = c(-78.633333, -79.819444, -77.912222),
y = c(35.766667, 36.08, 34.223333),
population = c(467665, 299035, 115451)) %>%
st_as_sf(coords = c("x", "y"), crs = 4326)
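For reference, the weighted mean used above is just the sum of each coordinate times its weight, divided by the total weight. Here is a small check with the same three longitudes and populations, written out by hand. Keep in mind that averaging longitude and latitude directly treats the coordinates as planar, which is an approximation.

```r
# Longitudes and populations of the three example cities
lon <- c(-78.633333, -79.819444, -77.912222)
population <- c(467665, 299035, 115451)

# Weighted mean written out: sum(w * x) / sum(w)
lon_by_hand <- sum(lon * population) / sum(population)

# Same quantity from the base R helper used in the answer
lon_builtin <- weighted.mean(lon, w = population)

# Compare the two results
all.equal(lon_by_hand, lon_builtin)
```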

Related

Plot filled areas for sea/ocean and land mass based on {osmdata} using {ggplot2} [duplicate]

This question already has an answer here:
Sf package: Close a polygon fom complex shape
(1 answer)
The reprex below shows how I would like to create a map via {osmdata} and {ggplot2} that has sea/ocean in it. I want to color-fill the land and/or sea area. However, it seems unexpectedly difficult to do so. This blog post even claims that it cannot be done.
This vignette of {osmplotr} seems to have the solution: "Because OpenStreetMap represents coastline as line objects, all coastline data is contained within the $osm_lines object. The osm_line2poly() function can then convert these lines to polygons which can be used to plot filled areas.". Yet, just as in this similar StackOverflow question, the function throws an error, as can be seen at the bottom of the reprex. I also found here that the {tigris} package can provide the necessary polygon data, but only for the US.
So how can I get this to work?
library(osmdata)
library(osmplotr)
library(sf)
library(tidyverse)
# define example bbox
bb <- tribble(
~xy, ~min, ~max,
"x", 12.00, 12.18,
"y", 54.08, 54.20
) %>% column_to_rownames("xy") %>% as.matrix()
# get "water"
water <- opq(bb) %>%
add_osm_feature(key = "natural", value = "water") %>%
osmdata_sf()
# get "coastline"
coast <- opq(bb) %>%
add_osm_feature(key = "natural", value = "coastline") %>%
osmdata_sf()
# ggplot
ggplot() +
geom_sf(
data = water$osm_multipolygons,
fill = "navy",
color = NA
) +
geom_sf(
data = coast$osm_lines,
fill = "navy",
color = "blue"
)
# trying osm_line2poly()
osmplotr::osm_line2poly(coast$osm_lines, bb)
#> Error in FUN(X[[i]], ...): unused argument (V = c(3, 1, 6, 7, 2, NA, 5))
Created on 2022-09-23 with reprex v2.0.2
Thanks to @JindraLacko, I was able to make my reprex work. Basically, we create a rectangle/polygon which is the size of our bbox and then split it via the coastline.
library(lwgeom)
library(osmdata)
library(osmplotr)
library(sf)
library(tidyverse)
### define example bbox
lon_min <- 12.00 # xmin
lon_max <- 12.18 # xmax
lat_min <- 54.08 # ymin
lat_max <- 54.20 # ymax
bb <- get_bbox(c(lon_min, lat_min, lon_max, lat_max))
### get "water" that is not sea as polygons
water <- opq(bb) %>%
add_osm_feature(key = "natural", value = "water") %>%
osmdata_sf()
### get sea & land as polygons
# 1. get coastline (as line)
coast <- opq(bb) %>%
add_osm_feature(key = "natural", value = "coastline") %>%
osmdata_sf()
# 2. get overall rectangle for bbox
bb_rect <- data.frame(
lat = c(lat_min, lat_max),
lon = c(lon_min, lon_max)
) %>%
st_as_sf(coords = c("lon", "lat"), crs = 4326) %>%
st_bbox() %>%
st_as_sfc()
# 3. split overall rectangle for bbox via coastline
bb_rect_split <- bb_rect %>%
st_split(coast$osm_lines) %>%
st_collection_extract("POLYGON")
# 4. extract the split rectangle parts
land <- bb_rect_split[1]
sea <- bb_rect_split[2]
### ggplot
ggplot() +
geom_sf(
data = land,
fill = "bisque",
color = NA
) +
geom_sf(
data = sea,
fill = "navy",
color = NA
) +
geom_sf(
data = water$osm_multipolygons,
fill = "navy",
color = NA
)
Created on 2022-09-26 with reprex v2.0.2
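To see what the split step does in isolation, here is a toy sketch with a made-up rectangle and a straight line standing in for the bbox and the coastline (no CRS, no OSM download). One caveat the toy makes visible: the order of the resulting parts is not tied to land or sea, so which index is which has to be checked, for example by plotting.

```r
library(sf)
library(lwgeom)  # provides st_split()

# Made-up 2 x 1 rectangle standing in for the bbox
rect <- st_as_sfc(st_bbox(c(xmin = 0, ymin = 0, xmax = 2, ymax = 1)))

# Made-up vertical line standing in for the coastline; it has to cross the
# rectangle completely, otherwise nothing is split off
cut_line <- st_sfc(st_linestring(rbind(c(1, -1), c(1, 2))))

# st_split() returns a GEOMETRYCOLLECTION; extract the polygon parts
parts <- st_collection_extract(st_split(rect, cut_line), "POLYGON")

length(parts)   # number of pieces
st_area(parts)  # area of each piece
```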

How to fill data gaps with random coordinates from a polygon?

I have a data set of whale sightings with some coordinate gaps but associated with areas of reference, of which I have coordinate limits. I've used these limits to create a polygon restricted to the marine environment (using library ‘sf’) for each of the areas. Now I would like to fill the coordinate gaps by randomly selecting latitudes and longitudes from the polygons.
My piece of code (example for the area 'Angola'):
#Creating an ocean-only polygon for the Southern Hemisphere (my study area)
x_coord = c(180, 180, -180, -180)
y_coord = c(0, -90, -90, 0)
polygonSH = cbind(x_coord, y_coord) %>%
st_linestring() %>%
st_cast("POLYGON") %>%
st_sfc(crs = 4326, check_ring_dir = TRUE) %>%
st_sf()
land = rnaturalearth::ne_countries(returnclass = "sf") %>%
st_union()
ocean = st_difference(polygonSH, land)
plot(st_geometry(land))
plot(st_geometry(polygonSH), add = TRUE)
plot(st_geometry(ocean), add = TRUE, col = "blue")
#Creating ocean-only polygons for each of the different areas, to then use them when generating random coords
#Angola
x_angola = c(11.72,11.72,13.58,13.58) #longitude limits of Angola area
y_angola = c(-12.34,-16.6,-16.6,-12.34) #latitude limits of Angola area
polygon_angola = cbind(x_angola, y_angola) %>%
st_linestring() %>%
st_cast("POLYGON") %>%
st_sfc(crs = 4326, check_ring_dir = TRUE) %>%
st_sf()
plot(st_geometry(land))
plot(st_geometry(polygon_angola), add = TRUE)
angola_ocean = st_difference(polygon_angola, land)
plot(st_geometry(angola_ocean), add = TRUE, col = "pink")
...
Before having the polygons restricted to the marine environment, I used the code below to randomly generate the coordinates. Ideally I would like to use something similar, but adapted to spatial data:
for(i in 1:dim(x)[1]) {
x[i,"lat"] <- ifelse(is.na(x[i,"lat"]) && x[i,"area"]=="Angola", runif(1,-16.6,-12.34), x[i,"lat"])
x[i,"long"] <- ifelse(is.na(x[i,"long"]) && x[i,"area"]=="Angola", runif(1, 11.72,13.58), x[i,"long"])
}
I would really appreciate folks' input on this issue.
I can't get your code to work due to issues (invalid spherical geometry) not directly related to the subject of the question.
So please allow me to illustrate a possible approach using the well known & much loved North Carolina shapefile that ships with the {sf}.
library(sf)
library(dplyr)
# included with sf package
shape <- st_read(system.file("shape/nc.shp", package="sf")) %>%
summarise() # a single polygon
# now the action! 50 random points over North Carolina
random_points <- shape %>%
st_sample(50)
# check results...
plot(shape)
plot(random_points, col = "red", pch = 4, add = TRUE)
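Carrying this over to the gap-filling task could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `sightings` table is made up, and `angola_poly` is a plain rectangle built from the Angola limits in the question, where the real workflow would use the ocean-only polygon.

```r
library(sf)

set.seed(42)  # reproducible random points

# Made-up sightings table with two coordinate gaps
sightings <- data.frame(
  area = c("Angola", "Angola", "Angola"),
  long = c(12.5, NA, NA),
  lat  = c(-14.0, NA, NA)
)

# Stand-in sampling polygon: the Angola limits as a rectangle
angola_poly <- st_as_sfc(st_bbox(
  c(xmin = 11.72, ymin = -16.6, xmax = 13.58, ymax = -12.34),
  crs = st_crs(4326)
))

# Rows of this area that lack coordinates
missing <- is.na(sightings$lat) & sightings$area == "Angola"

# Draw one random point per gap and write the coordinates back
pts <- st_sample(angola_poly, size = sum(missing), exact = TRUE)
xy <- st_coordinates(pts)
sightings$long[missing] <- xy[, "X"]
sightings$lat[missing]  <- xy[, "Y"]
```

Repeating the last block per area (with that area's polygon) would replace the `runif()` loop from the question.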

Projecting a quartered circle with a 50km radius in r/sf?

I'm hoping to create a series of quartered circles (i.e. circles split into 4 equal quadrants), each with a 50km radius, that I can map onto various longitudes and latitudes throughout the United States. I'd also like the option to rotate these quartered circles as desired.
Using the code below (and guidance from here), I've been able to make the following start:
[Image: New York State map]
I have two questions:
1. How can I meaningfully set the radius of these circles? Is there a way to draw shapes a certain distance (in km) from a coordinate in a projected CRS? So far I'm defining the radius in terms of degrees of longitude and latitude, but distance would be more useful.
2. My circles appear to be turning into ellipses after projecting them and mapping them in WGS84. Is there any way to prevent this from happening?
I would be happy to consider alternative approaches. Thanks!
library(sf)
library(ggplot2)
library(maps)
library(dplyr) # for %>%, mutate() and row_number()
#Two functions to create coordinate quartered circle polygons
#x = long, y = lat, r = radius, theta_rotate = rotation
st_wedge <- function(x,y,r,start,width, theta_rotate){
n <- 20
theta = seq(start+theta_rotate, start+width+theta_rotate, length=n)
xarc = x + r*sin(theta)
yarc = y + r*cos(theta)
xc = c(x, xarc, x)
yc = c(y, yarc, y)
st_polygon(list(cbind(xc,yc)))
}
st_wedges <- function(x, y, r, nsegs, theta_rotatex){
width = (2*pi)/nsegs
starts = (1:nsegs)*width
polys = lapply(starts, function(s){st_wedge(x,y,r,s,width, theta_rotatex)})
#Cast to crs 4326, WGS84
mpoly = st_cast((st_sfc(polys, crs = 4326)), "MULTIPOLYGON")
mpoly
}
#Create quartered sf circle polygon
custom_circle_sf <- st_wedges(x = -76, y = 43, r = .3, nsegs = 4, theta_rotatex = 200) %>%
st_sf() %>%
mutate(group = row_number()) %>% dplyr::select(group, geometry)
#Create New York State sf polygon
ny_map_sf <- map_data("state", region="new york") %>%
st_as_sf(coords = c("long", "lat"), crs = 4326) %>%
group_by(group) %>%
summarise(geometry = st_combine(geometry)) %>%
st_cast("POLYGON")
#Plot results
ggplot() +
geom_sf(data=ny_map_sf,
size = 1,
colour = "blue",
fill = "white") +
geom_sf(data=custom_circle_sf,
size = .1,
aes(fill=group),
colour = "white")
For anyone who is curious about splitting polygons in sf using R, this is how I went about solving it:
library(sf)
library(dplyr) # %>%
library(lwgeom) # st_split()
library(maptools) # elide()
#Function to create circle with quadrants. Save desired projection as projected_crs
#Note: despite the names, lat_x is used as the x coordinate (longitude) and long_y as the y coordinate (latitude)
create_circle <- function(lat_x, long_y, theta_x, buffer_m){
#Create circle with radius buffer_m centered at (lat_x, long_y)
circle_buffer <- st_point(c(lat_x, long_y)) %>% st_sfc(crs = 4326) %>%
st_cast("POINT") %>%
st_transform(projected_crs) %>%
st_buffer(buffer_m)
#Create two orthogonal lines at origin
p1 <- rbind(c(lat_x,long_y - 1), c(lat_x,long_y + 1))
p2 <- rbind(c(lat_x+1,long_y), c(lat_x-1,long_y))
mls <- st_multilinestring(list(p1,p2)) %>% st_sfc(crs = 4326) %>%
st_transform(projected_crs)
#Use orthogonal lines to split circle into 4 quadrants
x1 <- st_split(circle_buffer, mls)
#Convert origin into projected CRS
center_in_crs <- st_point(c(lat_x, long_y)) %>%
st_sfc(crs = 4326) %>%
st_transform(projected_crs)
sp_obj <- x1 %>% st_collection_extract(type="POLYGON") %>%
#Convert to spatial to use sp functions
as_Spatial() %>%
#rotate x degrees
elide(rotate = theta_x + 45, center = center_in_crs[[1]]) %>%
#return to sf
st_as_sf()
#Return the rotated quadrants
sp_obj
}
Regarding your question 2: "circles appear to be turning into ellipses". If you add to your ggplot the coord_equal() function then the grid will be square, and the ellipses will be shown as circles.
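Regarding question 1, one way to get a radius in real distance units is to build the wedges directly in a projected CRS whose units are metres. The sketch below reuses the wedge idea from the question. EPSG:5070 (NAD83 / Conus Albers) is an assumption picked for illustration, as are the helper name `wedge` and the rotation value.

```r
library(sf)

# Centre given as lon/lat, then projected to a CRS with metre units
# (EPSG:5070 is an assumption; pick a projection suited to your area)
center <- st_sfc(st_point(c(-76, 43)), crs = 4326)
center_m <- st_transform(center, 5070)
xy <- st_coordinates(center_m)

# One wedge of a circle, built directly in projected (metre) coordinates
wedge <- function(cx, cy, r, start, width, n = 30) {
  theta <- seq(start, start + width, length.out = n)
  ring <- rbind(c(cx, cy),
                cbind(cx + r * sin(theta), cy + r * cos(theta)),
                c(cx, cy))
  st_polygon(list(ring))
}

radius_m <- 50000                    # 50 km
rotation <- pi / 6                   # rotation in radians
starts <- (0:3) * pi / 2 + rotation  # start angle of each quadrant

# Four quadrants as one geometry set in the projected CRS
quadrants <- st_sfc(
  lapply(starts, function(s) wedge(xy[1], xy[2], radius_m, s, pi / 2)),
  crs = 5070
)
```

When the quadrants are drawn with geom_sf(), adding `coord_sf(crs = 5070)` plots the map in the projected CRS, where the circle should keep its round shape.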

Plotting a Buffer Around a Point on a Map - R SF

I've been trying to plot a buffer around a point on a map, but when I do, the buffer doesn't appear in the right place, like this.
[Image: faulty R map]
The correct location is in California.
Here's my code:
library(tigris)
library(sf)
library(tidyverse)
projection <- 102003
options(tigris_use_cache = TRUE)
county_polys <- counties(class = 'sf') %>%
filter(STATEFP %in% c('06','41','53','04','16','32','49')) %>%
st_transform(projection)
centroids <- county_polys %>%
as_tibble %>% select(INTPTLON,INTPTLAT) %>%
mutate(
INTPTLON = as.double(INTPTLON),
INTPTLAT = as.double(INTPTLAT)) %>%
st_as_sf(coords = c('INTPTLON','INTPTLAT'), crs = projection)
pt <- centroids[2,]
pt_buffer <- st_buffer(pt,150000)
ggplot() + geom_sf(data = county_polys) + geom_sf(data = pt_buffer,color = 'red')
We can use the st_centroid function to get the centroid to avoid errors. There is no need to convert the sf object to other classes.
# This is the only thing I changed from your original code
# Get the centroid by st_centroid
centroids <- county_polys %>% st_centroid()
pt <- centroids[2,]
pt_buffer <- st_buffer(pt,150000)
ggplot() + geom_sf(data = county_polys) + geom_sf(data = pt_buffer,color = 'red')
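A likely reason the original buffer lands in the wrong place: INTPTLON and INTPTLAT are in degrees, but `st_as_sf(..., crs = projection)` labels them as if they were already projected coordinates. If the lon/lat columns are needed, one option is to declare them with a geographic CRS first and then transform. The sketch below assumes NAD83 (EPSG:4269) for the input, a made-up point in California, and EPSG:5070 (NAD83 / Conus Albers) as a stand-in for the projection used above.

```r
library(sf)

# Made-up interior point in California, in degrees
pt_lonlat <- data.frame(lon = -120.5, lat = 37.2)

# Declare the coordinates as what they are: geographic NAD83 (assumed)
pt <- st_as_sf(pt_lonlat, coords = c("lon", "lat"), crs = 4269)

# Only now move to a projected CRS with metre units
pt_proj <- st_transform(pt, 5070)

# 150 km buffer, measured in metres of the projected CRS
pt_buffer <- st_buffer(pt_proj, 150000)
```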

Proximity Maps using R

I'm looking to create some proximity maps using R, which show how far areas are from certain points. I can't find any examples in R code, but I've found an output which is the sort of thing I want:
It doesn't necessarily have to have all the labelling/internal boundaries wizardry, but I'd like it to stop at the sea border (thinking of using the rgeos function gIntersection - see here).
I've tried doing a density plot as 'heatmaps' (this would be a pretty good solution/alternative) and putting a shapefile over the top (following this suggestion), but they're not lining up and I can't do a gIntersection, probably because there's no coordinate system attached to the density plot.
I used your question to play a little with new libraries...
Get a UK map and define random points
library(raster)
library(sf)
library(ggplot2)
library(dplyr)
library(tidyr)
library(forcats)
library(purrr)
# Get UK map
GBR <- getData(name = "GADM", country = "GBR", level = 1)
GBR_sf <- st_as_sf(GBR)
# Define 3 points on the UK map
pts <- matrix(c(-0.4966766, -2.0772529, -3.8437793,
51.91829, 52.86147, 56.73899), ncol = 2)
# Project to British National Grid (EPSG:27700) so that buffers can use distances in metres
pts_sf <- st_sfc(st_multipoint(pts), crs = 4326) %>%
st_sf() %>%
st_transform(27700)
ggplot() +
geom_sf(data = GBR_sf) +
geom_sf(data = pts_sf, colour = "red")
Calculate buffer areas
We create a list of multipolygons, one for each buffer distance. The point dataset must be in projected coordinates (here British National Grid, EPSG:27700) because the buffer distance is expressed in the units of the coordinate system.
# Define distances to buffer
dists <- seq(5000, 150000, length.out = 5)
# Create buffer areas with each distances
pts_buf <- purrr::map(dists, ~st_buffer(pts_sf, .)) %>%
do.call("rbind", .) %>%
st_cast() %>%
mutate(
distmax = dists,
dist = glue::glue("<{dists/1000} km"))
# Plot: alpha allows to see overlapping polygons
ggplot() +
geom_sf(data = GBR_sf) +
geom_sf(data = pts_buf, fill = "red",
colour = NA, alpha = 0.1)
Remove overlapping
Buffer areas are overlapping. In the figure above, the more intense red color is due to multiple overlapping layers of transparent red. Let's remove the overlap: from each larger buffer we subtract the next smaller one, which leaves rings. The smallest buffer then has to be added back to the list.
# Remove part of polygons overlapping smaller buffer
pts_holes <- purrr::map2(tail(1:nrow(pts_buf),-1),
head(1:nrow(pts_buf),-1),
~st_difference(pts_buf[.x,], pts_buf[.y,])) %>%
do.call("rbind", .) %>%
st_cast() %>%
select(-distmax.1, -dist.1)
# Add smallest polygon
pts_holes_tot <- pts_holes %>%
rbind(filter(pts_buf, distmax == min(dists))) %>%
arrange(distmax) %>%
mutate(dist = forcats::fct_reorder(dist, distmax))
# Plot and define color according to dist
ggplot() +
geom_sf(data = GBR_sf) +
geom_sf(data = pts_holes_tot,
aes(fill = dist),
colour = NA) +
scale_fill_brewer(direction = 2)
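The ring construction can be checked on a toy case. The sketch below uses a single made-up point and three made-up distances rather than the UK data, and confirms that the rings tile the largest disc without overlap.

```r
library(sf)

# Made-up point in British National Grid coordinates (metres)
pt <- st_sfc(st_point(c(400000, 300000)), crs = 27700)
dists <- c(1000, 2000, 3000)

# One full disc per distance
discs <- lapply(dists, function(d) st_buffer(pt, d))

# Keep the smallest disc, then subtract each disc from the next larger one
rings <- c(
  discs[[1]],
  st_difference(discs[[2]], discs[[1]]),
  st_difference(discs[[3]], discs[[2]])
)

# The rings no longer overlap, so their areas add up to the largest disc
sum(st_area(rings))
st_area(discs[[3]])
```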
Remove areas in the sea
If you want proximity areas on land only, we need to remove the parts of the buffers that lie in the sea. The intersection is computed between multipolygons in the same projection, so I first compute a union of the UK map and project it.
# Remove part of polygons in the sea
# Union and projection of UK map
GBR_sf_merc <- st_transform(st_union(GBR_sf), 27700)
pts_holes_uk <- st_intersection(pts_holes_tot,
GBR_sf_merc)
ggplot() +
geom_sf(data = GBR_sf) +
geom_sf(data = pts_holes_uk,
aes(fill = dist),
colour = NA) +
scale_fill_brewer(direction = 2)
And here is the final proximity map using sf, ggplot2 and a few other libraries...
Based on Sébastien's example, a more old-fashioned approach:
library(raster)
GBR <- getData(name = "GADM", country = "GBR", level = 1)
pts <- matrix(c(-0.4966766, -2.0772529, -3.8437793, 51.91829, 52.86147, 56.73899), ncol = 2)
r <- raster(GBR, res=1/12)
d <- distanceFromPoints(r, pts)
m <- mask(d, GBR)
plot(m)
