Related
I have a shapefile of local government regions. I use sf_read() to import it into R as an SF object. I want to compute the distance between the local government regions. st_centroid() gives me polygon centroids and I can compute distance using st_distance().
regions <- st_read("~/Downloads/regions.shp")
regions_with_centroids <- st_centroid(regions)
extract_centroids <- regions_with_centroids %>%
st_drop_geometry() %>%
as_tibble() %>%
select(region_name, centroid)
# create edge list
edge_list <- extract_centroids %>%
select(region_name) %>%
expand(from = region_name, to = region_name) %>%
filter(from < to) %>%
left_join(extract-centroids, by = c("from" = "region_name) %>%
rename(from_centroid = centroid) %>%
left_join(extract-centroids, by = c("to" = "region_name) %>%
rename(to_centroid = centroid) %>%
mutate(distance = st_distance(from_centroid, to_centroid)
However, I really want to analyze the commuting distance between major urban areas in each government region. I need to shift the centroids to the population "centre of gravity".
I can use a shapefile of census enumerator areas to help me with this. The enumerator areas are sized by population. Using st_intersection() I can intersect the enumerator areas with the government regions. This gives me sub-regions within each government region. I can compute centroids for all the sub-regions. Grouping by region, I can compute the mean centroid for all the sub-regions in a region. The mean centroid = "centre of gravity", which gives a more realistic commute distance between regions.
regions <- st_read("~/Downloads/regions.shp")
ea <- st_read("~/Downloads/enumerator_areas.shp")
intersected <- st_intersection(regions, ea)
sub_region_centroids <- st_centroids(intersected)
Where I run into difficulty is how to find the mean centroid. Grouping by region is not working.
mean_centroid <- sub_region_centroids %>%
group_by(region_name) %>%
summarise(mean_centroid = mean(geometry))
Warning messages:
1: In mean.default(geometry) :
the argument is not numeric or logical: returning NA
Where am I going wrong?
I also do not know how to add the mean centroid back to the original region's object.
I hope someone can assist me.
Computing a population weighted average of multiple centroids is an interesting problem.
You can consider approach like this - where I calculate the weighted centroid of three cities in North Carolina (to make use of the well known & much loved nc.shp file that ships with {sf}).
The workflow uses tidyr::uncount() to first multiply the city points per population, the (many) multiplied points are then united to a single multipoint feature. And multipoint features have defined sf::st_centroid() operation (QED). The final sf::st_as_sf() is just a polish.
library(sf)
library(dplyr)
library(ggplot2)
# included with sf package
shape <- st_read(system.file("shape/nc.shp", package="sf"))
# dramatis personae; population as per Wikipedia
cities <- data.frame(name = c("Raleigh", "Greensboro", "Wilmington"),
x = c(-78.633333, -79.819444, -77.912222),
y = c(35.766667, 36.08, 34.223333),
population = c(467665, 299035, 115451)) %>%
st_as_sf(coords = c("x", "y"), crs = 4326)
# a quick overview of facts on ground
ggplot() +
geom_sf(data = shape) + # polygon of North Carolina
geom_sf(data = cities, color = "red") # 3 cities
# unweighted centroid / a baseline
plain_center <- cities %>%
st_geometry() %>% # pull just geometry
st_combine() %>% # from many points to a single multipoint
st_centroid() %>% # compute centroid of the multipoint
st_as_sf() # make it a sf object again
# the fun is here!!
center_of_centers <- cities %>%
tidyr::uncount(population) %>% # multiply rows according to population
st_geometry() %>% # pull just geometry
st_combine() %>% # from many points to a single multipoint
st_centroid() %>% # compute centroid of the multipoint
st_as_sf() # make it a sf object again
# finished result
ggplot() +
geom_sf(data = shape, color = "gray75") + # polygon of North Carolina
geom_sf(data = cities, color = "red") + # 3 cities
geom_sf(data = plain_center, color = "green") + # unweighted center
geom_sf(data = center_of_centers, color = "blue", pch = 4) # population weighted center
Following #Jindra Lacko's nice example, here is how it can be done taking the weighted mean of the lat and long.
library(sf)
library(dplyr)
library(ggplot2)
# weighted mean of lat and long
center_weighted <- cities %>%
mutate(lon = sf::st_coordinates(.)[,1],
lat = sf::st_coordinates(.)[,2]) %>%
st_drop_geometry() %>%
summarize(across(c(lon, lat), weighted.mean, w = population)) %>%
st_as_sf(coords = c("lon", "lat"), crs = 4326)
# plot it
ggplot() +
geom_sf(data = shape, color = "gray75") +
geom_sf(data = cities, color = "red") +
geom_sf(data = center_weighted, color = "blue", pch = 4)
Data
# set up example data
shape <- st_read(system.file("shape/nc.shp", package="sf"))
cities <- data.frame(name = c("Raleigh", "Greensboro", "Wilmington"),
x = c(-78.633333, -79.819444, -77.912222),
y = c(35.766667, 36.08, 34.223333),
population = c(467665, 299035, 115451)) %>%
st_as_sf(coords = c("x", "y"), crs = 4326)
I'm hoping to create a series of quartered circles (i.e. circles split into 4 equal quadrants), each with a 50km radius, that I can map onto various longitudes and latitudes throughout the United States. I'd also like the option to rotate these quartered circles as desired.
Using the code below (and guidance from here), I've been able to make the following start:
New York State Map
I have two questions:
How can I meaningfully set the radius of these circles? Is there a way to draw shapes a certain distance (in km) from a coordinate in a projected CRS? So far I'm defining the radius in terms of degrees of longitude and latitude, but distance would be more useful.
My circles appear to be turning into ellipses after projecting them and mapping them in WGS84. Is there any way to prevent this from happening?
I would be happy to consider alternative approaches. Thanks!
library(sf)
library(ggplot2)
library(maps)
#Two functions to create coordinate quartered circle polygons
#x = long, y = lay, r = radius, theta_rotate = rotation
st_wedge <- function(x,y,r,start,width, theta_rotate){
n <- 20
theta = seq(start+theta_rotate, start+width+theta_rotate, length=n)
xarc = x + r*sin(theta)
yarc = y + r*cos(theta)
xc = c(x, xarc, x)
yc = c(y, yarc, y)
st_polygon(list(cbind(xc,yc)))
}
st_wedges <- function(x, y, r, nsegs, theta_rotatex){
width = (2*pi)/nsegs
starts = (1:nsegs)*width
polys = lapply(starts, function(s){st_wedge(x,y,r,s,width, theta_rotatex)})
#Cast to crs 4326, WGS84
mpoly = st_cast((st_sfc(polys, crs = 4326)), "MULTIPOLYGON")
mpoly
}
#Create quartered sf circle polygon
custom_circle_sf <- st_wedges(x = -76, y = 43, r = .3, nsegs = 4, theta_rotatex = 200) %>%
st_sf() %>%
mutate(group = row_number()) %>% dplyr::select(group, geometry)
#Create New York State sf polygon
ny_map_sf <- map_data("state", region="new york") %>%
st_as_sf(coords = c("long", "lat"), crs = 4326) %>%
group_by(group) %>%
summarise(geometry = st_combine(geometry)) %>%
st_cast("POLYGON")
#Plot results
ggplot() +
geom_sf(data=ny_map_sf,
size = 1,
colour = "blue",
fill = "white") +
geom_sf(data=custom_circle_sf,
size = .1,
aes(fill=group),
colour = "white")
For anyone who is curious about splitting polygons in sf using R, this was how I went about solving this:
#Function to create circle with quadrants. Save desired projection as projected_crs
create_circle <- function(lat_x, long_y, theta_x, buffer_m){
#Create circle with radius buffer_m centered at (lat_x, long_y)
circle_buffer <- st_point(c(lat_x, long_y)) %>% st_sfc(crs = 4326) %>%
st_cast("POINT") %>%
st_transform(projected_crs) %>%
st_buffer(buffer_m)
#Create two orthogonal lines at origin
p1 <- rbind(c(lat_x,long_y - 1), c(lat_x,long_y + 1))
p2 <- rbind(c(lat_x+1,long_y), c(lat_x-1,long_y))
mls <- st_multilinestring(list(p1,p2)) %>% st_sfc(crs = 4326) %>%
st_transform(projected_crs)
#Use orthogonal lines to split circle into 4 quadrants
x1 <- st_split(circle_buffer, mls)
#Convert origin into projected CRS
center_in_crs <- st_point(c(lat_x, long_y)) %>%
st_sfc(crs = 4326) %>%
st_transform(projected_crs)
sp_obj <- x1 %>% st_collection_extract(type="POLYGON") %>%
#Convert to spatial to use sp functions
as_Spatial() %>%
#rotate x degrees
elide(rotate = theta_x + 45, center = center_in_crs[[1]]) %>%
#return to sf
st_as_sf()
Regarding your question 2: "circles appear to be turning into ellipses". If you add to your ggplot the coord_equal() function then the grid will be square, and the ellipses will be shown as circles.
I'm looking to crop the density plot to only land while keeping to sf.
Here's a simple example problem:
library(tidyverse)
library(sf)
library(albersusa)
library(ggthemes)
library(jsonlite)
dat <-
fromJSON(
"https://services1.arcgis.com/Hp6G80Pky0om7QvQ/arcgis/rest/services/Fortune_500_Corporate_Headquarters/FeatureServer/0/query?where=1%3D1&outFields=LATITUDE,LONGITUDE,NAME,PROFIT&outSR=4326&f=json"
)
dat <- as.data.frame(dat$features$attributes)
top_50 <- dat %>%
arrange(desc(PROFIT)) %>%
head(50)
ggplot() +
geom_sf(data = usa_sf()) +
geom_density_2d_filled(aes(x = LONGITUDE, y = LATITUDE),
data = top_50,
alpha = .5) +
xlim(-125,-66.5) +
ylim(20, 50) +
theme_map() +
theme(legend.position = "none")
Not sure if I'm getting close to a solution but here's some of the code I've been trying:
test <- (MASS::kde2d(
top_50$LONGITUDE, top_50$LATITUDE,
lims = c(-125,-66.5, 20, 50)
))
ggpoly2sf <- function(poly, coords = c("long", "lat"),
id = "group", region = "region", crs = 4326) {
sf::st_as_sf(poly, coords = coords, crs = crs) %>%
group_by(!! as.name(id), !! as.name(region)) %>%
summarize(do_union=FALSE) %>%
sf::st_as_sf("POLYGON") %>%
ungroup() %>%
group_by(!! as.name(region)) %>%
summarize(do_union = TRUE) %>%
ungroup()
}
v <- contourLines(test)
vv <- v
for (i in seq_along(v)) vv[[i]]$group <- i
vv <- do.call(rbind, lapply(vv, as.data.frame))
dsi_sf <- ggpoly2sf(vv, coords = c("x", "y"), region = "level") %>% st_as_sf()
usa <- usa_sf()
dsi_i_sf <- st_intersection(usa$geometry, dsi_sf)
ggplot() +
geom_sf(data=usa) +
geom_sf(data=dsi_i_sf,color="red") +
geom_density2d_filled(aes(x = LONGITUDE, y = LATITUDE),
data = top_50,alpha=.4) +
xlim(-125,-66.5) +
ylim(20, 50) +
theme(legend.position = "none")
Create a rectangle of the same plot dimensions:
rec_box <- data.frame(x=c(-125,-125,-66.5,-66.5,-125), y=c(20,50,50,20,20))
Create an outline of the US and extract only the lat/lon points into a dataframe:
outline <- map("usa", plot=FALSE)
outline <- data.frame(x=outline$x,y=outline$y)
Bind the two together to create a polygon with a hole in the middle:
mask <- rbind(rec_box,outline)
Add a geom_polygon() to plot the mask data and color appropriately:
geom_polygon(data=mask,
aes(x=x,y=y),color="white",fill="white")
Everything combined:
library(tidyverse)
library(sf)
library(albersusa)
library(ggthemes)
library(jsonlite)
dat <-
fromJSON(
"https://services1.arcgis.com/Hp6G80Pky0om7QvQ/arcgis/rest/services/Fortune_500_Corporate_Headquarters/FeatureServer/0/query?where=1%3D1&outFields=LATITUDE,LONGITUDE,NAME,PROFIT&outSR=4326&f=json"
)
dat <- as.data.frame(dat$features$attributes)
top_50 <- dat %>%
arrange(desc(PROFIT)) %>%
head(50)
usa <- usa_sf()
outline <- map("usa", plot=FALSE)
outline <- data.frame(x=outline$x,y=outline$y)
rec_box <- data.frame(x=c(-125,-125,-66.5,-66.5,-125), y=c(20,50,50,20,20))
mask <- rbind(rec_box,outline)
ggplot() +
geom_sf(data = usa_sf()) +
geom_density_2d_filled(aes(x = LONGITUDE, y = LATITUDE),
data = top_50,
alpha = .5) +
xlim(-125,-66.5) +
ylim(20, 50) +
geom_polygon(data=mask,
aes(x=x,y=y),color="white",fill="white") +
theme_map() +
theme(legend.position = "none")
Really a thing of beauty.
For a mask layer over the US with AK & HI inset:
library(tidyverse)
library(sf)
library(albersusa)
library(ggthemes)
library(jsonlite)
library(spatstat)
dat <-
fromJSON(
"https://services1.arcgis.com/Hp6G80Pky0om7QvQ/arcgis/rest/services/Fortune_500_Corporate_Headquarters/FeatureServer/0/query?where=1%3D1&outFields=LATITUDE,LONGITUDE,NAME,PROFIT&outSR=4326&f=json"
)
dat <- as.data.frame(dat$features$attributes)
top_50 <- dat %>%
arrange(desc(PROFIT)) %>%
head(50)
usa <- usa_sf()
top50sf <- st_as_sf(top_50, coords = c("LONGITUDE", "LATITUDE")) %>%
st_set_crs(4326) %>%
st_transform(st_crs(usa))
# usa polygons combined
usa_for_mask <- usa_sf() %>%
st_geometry() %>%
st_cast('POLYGON') %>%
st_union()
# bounding box of us & inset AK + HI,
# expand as needed
us_bbox <- st_bbox(usa_for_mask) %>%
st_as_sfc() %>%
st_as_sf()
us_mask <- st_difference(us_bbox, usa_for_mask)
ggplot() +
geom_sf(data = usa) +
geom_density_2d_filled(aes(x = LONGITUDE, y = LATITUDE),
data = top_50,
alpha = .5) +
geom_sf(data = us_mask, fill = 'white') +
xlim(-125,-66.5) +
ylim(20, 50) +
theme_map() +
theme(legend.position = "none")
Created on 2021-04-05 by the reprex package (v1.0.0)
You can expand the bounding box to get rid of the purple border around the plot.
This does what you're asking for, but almost certainly isn't spatially accurate. It can get a point across to a general audience, but don't make any big decisions based on it.
More accurate spatial interpolation methods can be found here:
https://rspatial.org/raster/analysis/4-interpolation.html
https://mgimond.github.io/Spatial/interpolation-in-r.html
I'm looking to create a masking polygon for multiple polygons.
It's easy enough for a single polygon:
How to apply a polygon mask layer in ggplot
but much trickier for multiple:
https://www.stat.auckland.ac.nz/~paul/Reports/GraphicsEngine/definitions/definitions.html
I feel like I'm fairly close but I need a method of drawing a line between a point for the closest polygon, i.e. I'd like to try and connect nearest islands with a line.
While also connecting those polygons to an outer polygon box but only with a single line:
library(tidyverse)
library(sf)
library(albersusa)
usa <- usa_sf()
HI <- st_coordinates(usa %>%
filter(name %in% c("Hawaii"))) %>%
as.data.frame() %>%
select(X, Y)
rec_box <-
data.frame(
X = c(-108,-108,-101,-101,-108),
Y = c(24, 28, 28, 24, 24)
)
mask <- rbind(HI, rec_box)
eg <- st_as_sf(data.frame(mask), coords = c("X", "Y"))
poly <- st_convex_hull(eg)
ggplot() +
geom_sf(data = poly) +
geom_density2d_filled(data = HI, aes(x = X, y = Y)) +
geom_polygon(data = mask,
aes(x = X, y = Y),
color = "black",
fill = "white")
Edit:
For only a mask around the islands:
library(tidyverse)
library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
library(albersusa)
#library(concaveman)
#library(nngeo)
usa <- usa_sf()
# keep only the geometry from Hawaii (not the data)
HI <- usa %>%
filter(name %in% c("Hawaii")) %>%
st_cast('POLYGON') %>%
st_geometry() %>%
st_as_sf()
# HI as dataframe as your example
HI2 <- st_coordinates(usa %>%
filter(name %in% c("Hawaii"))) %>%
as.data.frame() %>%
select(X, Y)
# you may need a larger box. This uses sf:st_bbox to get exact
# rectangle of the polygons. adjust accordingly
hi_box <- st_bbox(HI) %>% st_as_sfc() %>% st_as_sf()
# find the difference betwee HI & hi_box
hi_mask <- st_difference(hi_box, st_union(HI))
ggplot() +
geom_density_2d_filled(data = HI2, aes(x = X, y = Y)) +
geom_sf(data = hi_mask, fill = 'black')
Created on 2021-04-04 by the reprex package (v0.3.0)
If you're happy with the HI mask created above, you can connect it to your defined bounding box using a single line with nngeo::st_connect().
This should find the shortest connecting line between the two. If you prefer the connecting line to be at a specific point (southwest corner in your example above), supply it as an sf object to the st_connect() function instead of the entire box.
library(tidyverse)
library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
library(albersusa)
library(concaveman)
library(nngeo)
usa <- usa_sf()
# Hawaii, cast to POLYGON rather than the original MULTIPOLYGON
HI <- (usa %>%
filter(name %in% c("Hawaii"))) %>%
st_cast('POLYGON')
connected_HI <- concaveman(HI)
# rec box as an sf object with same crs as HI
rec_box <-
data.frame(X = c(-108, -108, -101, -101, -108),
Y = c(24, 28, 28, 24, 24)) %>%
st_as_sf(coords = c('X', 'Y'
)) %>%
st_set_crs(st_crs(HI))
# connecting HI mask you made to your rec_box
connected_to_rec_box <- nngeo::st_connect(st_zm(connected_HI), rec_box)
ggplot() +
geom_sf(data = connected_HI, fill = 'turquoise', alpha = .4) +
geom_sf(data = connected_to_rec_box, color = 'red')
Created on 2021-04-04 by the reprex package (v0.3.0)
What I have:
points in the arctic and antarctic
raster data from various geophysical entities in arctic and antarctic
What I want:
A map in stereographic or any other polar projection with background map or coastlines, cropped to the extent of the points. In other words: A map like above with base map of my own choice.
What I did so far:
I loaded all the data (including land surface data from naturalearthdata; see MWE), projected them into stereographic and plotted that. The result including the polygon data looks then like this:
My MWE:
library(raster)
library(sf)
library(ggplot2)
library(rgdal)
# file load ---------------------------------------------------------------
# sea ice raster data
if (!file.exists("seaiceraster.tif")) {
url = "https://seaice.uni-bremen.de/data/smos/tif/20100514_hvnorth_rfi_l1c.tif"
download.file(url, destfile = 'seaiceraster.tif')
}
si.raster = raster::raster('seaiceraster.tif')
# land surface shapefile
if (!file.exists("110m-admin-0-countries")) {
url_land = "https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_land.zip"
download.file(url_land, destfile = "110m-admin-0-countries")
unzip("110m-admin-0-countries")
}
world_shp = rgdal::readOGR("ne_10m_land.shp")
# points
p.data = structure(
list(
Lat = c(
73.0114126168676,70.325555278764,77.467797903163,
58.6423827457304,66.3616310851294,59.2097857474643,
75.3135274436283,60.1983078512275,72.6614399747201,
61.1566678672946,73.0822309615673,55.7759666826898,
75.1651656433833,69.0130753414173,62.3288262448589
),
Lon = c(
-59.9175490701543,-80.1900239630732,-40.4609968914928,
-61.0914448815381,-60.0703668488408,-21.027205418284,
-100.200463810276,-74.861777073788,-55.1093773178206,
-29.4108649230234,-64.5878251008461,-36.5343322019187,
-31.647365623387,-67.466355105829,-64.1162329769077
)
),
row.names = c(
1911L, 592L,2110L,3552L,3426L,1524L,635L,4668L,
3945L,2848L,3609L,36L,4262L,3967L,2725L
),
class = "data.frame"
)
p = sf::st_as_sf(p.data, coords = c("Lon", "Lat"),
crs = "+init=epsg:4326")
# project -----------------------------------------------------------------
polar.crs = CRS("+init=epsg:3995")
si.raster.proj = projectRaster(si.raster, crs = polar.crs)
world_shp.proj = sp::spTransform(world_shp, polar.crs)
p.proj = sf::st_transform(p, polar.crs)
# preparation -------------------------------------------------------------
AG = ggplot2::fortify(world_shp.proj)
# make raster to data.frame
si.raster.df = si.raster.proj %>%
raster::crop(., p.proj) %>%
raster::rasterToPoints(., spatial = TRUE) %>%
as.data.frame(.)
colnames(si.raster.df) = c("val", "x", "y")
# plot --------------------------------------------------------------------
ggplot() +
# geom_polygon(data = AG, aes(long, lat, group = group)) + # un-comment to see
geom_raster(data = si.raster.df, aes(x = x, y = y, fill = val)) +
geom_sf(data = p.proj, color = "green", size = 3)
I've changed the workflow in your example a bit to add the stars package for the sea ice data, but I think it should get you what you're looking for. You'll need to adjust the crop size to expand it a little, as the points p are right on the edge of the plotted area. st_buffer might help with that.
I used the crs from the seaicebuffer.tif file for all of the objects.
The .tif file has a crs that I'm not able to easily transform on my computer. It seems to be able to use meters as a lengthunit and might be a polar stereographic (variant B) projection. The points & world data don't seem to have a problem transforming to it though, which is why I've used it throughout.
library(raster)
library(sf)
library(ggplot2)
library(rgdal)
library(stars)
si <- stars::read_stars('seaiceraster.tif')
world_sf = rgdal::readOGR("ne_10m_land.shp") %>%
st_as_sf() %>%
st_transform(st_crs(si))
# p <- ... same as example and then:
p <- st_transform(p, st_crs(si))
# get a bounding box for the points to crop si & world.
p_bbox <- st_bbox(p) %>%
st_as_sfc() %>%
st_as_sf() %>%
st_buffer(100000)
# crop si & world_sf to an area around the points (p)
world_cropped <- st_crop(world_sf, p_bbox)
si_cropped <- st_crop(si, p_bbox)
#Plot
ggplot() +
geom_sf(data = world_cropped,
color = 'black',
fill = 'NA',
size = .2) +
geom_stars(data = si_cropped) +
geom_sf(data = p, color = 'red') +
scale_fill_continuous(na.value = 0)
Ugly hack for the southern .tif that stars reads as factors:
si <- stars::read_stars('20150324_hvsouth_rfi_l1c.tif', NA_value = 0 )
si$"20150324_hvsouth_rfi_l1c.tif" <- as.numeric(si$"20150324_hvsouth_rfi_l1c.tif")
ggplot() + geom_stars(data = si)