How to truly calculate a spherical Voronoi diagram using sf in R?

I want to make a world map with a Voronoi tessellation that respects the spherical nature of the world (not a projection of it), similar to this D3.js example, but in R.
As I understand it (from "Goodbye flat Earth, welcome S2 spherical geometry"), the sf package is now fully based on the s2 package for geographic coordinates, so it should do what I need. But I don't think I am getting the expected results. A reproducible example:
library(tidyverse)
library(sf)
library(rnaturalearth)
library(tidygeocoder)
# just to be sure
sf::sf_use_s2(TRUE)
# download map
world_map <- rnaturalearth::ne_countries(
  scale = 'small',
  type = 'map_units',
  returnclass = 'sf')
# addresses that you want to find lat long for and to become centroids of the voronoi tessellation
addresses <- tribble(
  ~addr,
  "Juneau, Alaska",
  "Saint Petersburg, Russia",
  "Melbourne, Australia"
)
# retrieve lat long using tidygeocoder
points <- addresses %>%
  tidygeocoder::geocode(addr, method = 'osm')
# Transform lat long into a single geometry point and join with the sf base map of the world
points <- points %>%
  dplyr::rowwise() %>%
  dplyr::mutate(point = list(sf::st_point(c(long, lat)))) %>%
  sf::st_as_sf() %>%
  sf::st_set_crs(4326)
# voronoi tessellation
voronoi <- sf::st_voronoi(sf::st_union(points)) %>%
  sf::st_as_sf() %>%
  sf::st_set_crs(4326)
# plot
ggplot2::ggplot() +
  geom_sf(data = world_map,
          mapping = aes(geometry = geometry),
          fill = "gray95") +
  geom_sf(data = points,
          mapping = aes(geometry = point),
          colour = "red") +
  geom_sf(data = voronoi,
          mapping = aes(geometry = x),
          colour = "red",
          alpha = 0.5)
The whole of Antarctica should be closer to Melbourne than to the other two points. What am I missing here? How do I calculate a Voronoi tessellation on a sphere using sf?

(This answer doesn't tell you how to do it, but does tell you what's going wrong.)
When I ran this code I got
Warning message:
In st_voronoi.sfc(sf::st_union(points)) :
st_voronoi does not correctly triangulate longitude/latitude data
From digging into the code, it looks like this is a known limitation. Looking at the C++ code for CPL_geos_voronoi, it directly calls a GEOS method for building Voronoi diagrams. It might be worth opening an sf issue to indicate that this is a feature you would value (if no one tells the developers that particular features would be useful, they don't get prioritized ...). It doesn't surprise me that GEOS doesn't automatically do computations that account for spherical geometry. Although the S2 code base mentions Voronoi diagrams in a variety of places, it doesn't look like there is a drop-in replacement for the GEOS algorithm. There are implementations of spherical Voronoi diagrams in other languages (e.g. Python), but someone would probably have to port them to R (or C++) ...
If I really needed to do this, I would probably try to figure out how to call the Python code from within R (exporting the data from sf format to whatever Python needs, then re-importing the results into an appropriate sf format ...).
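For what it's worth, here is a minimal, untested sketch of that route via reticulate and scipy.spatial.SphericalVoronoi (both assumed to be installed). The helper functions, object names and the random generator points are made up for illustration; they stand in for the geocoded cities in the question.
library(reticulate)
library(sf)
scipy_spatial <- import("scipy.spatial")
# lon/lat in degrees -> unit-sphere cartesian coordinates
lonlat_to_xyz <- function(lon, lat) {
  lon <- lon * pi / 180
  lat <- lat * pi / 180
  cbind(cos(lat) * cos(lon), cos(lat) * sin(lon), sin(lat))
}
# unit-sphere cartesian -> lon/lat in degrees
xyz_to_lonlat <- function(m) {
  cbind(atan2(m[, 2], m[, 1]), asin(pmin(pmax(m[, 3], -1), 1))) * 180 / pi
}
# a handful of arbitrary generator points (stand-ins for your geocoded cities)
set.seed(1)
lon <- runif(12, -180, 180)
lat <- runif(12, -80, 80)
sv <- scipy_spatial$SphericalVoronoi(lonlat_to_xyz(lon, lat), radius = 1)
sv$sort_vertices_of_regions()
# one polygon per generator point; Python indices are 0-based, hence the + 1
polys <- lapply(sv$regions, function(idx) {
  i <- unlist(idx) + 1
  ring <- xyz_to_lonlat(sv$vertices[i, , drop = FALSE])
  st_polygon(list(rbind(ring, ring[1, ])))
})
voronoi_sphere <- st_sfc(polys, crs = 4326)
Note that the edges here are straight segments in lon/lat, which only approximate the true great-circle cell boundaries; for mapping you would still want to densify them (e.g. with sf::st_segmentize) and handle cells that cross the dateline or contain a pole.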
Printing the code for sf:::st_voronoi.sfc:
function (x, envelope = st_polygon(), dTolerance = 0, bOnlyEdges = FALSE)
{
    if (compareVersion(CPL_geos_version(), "3.5.0") > -1) {
        if (isTRUE(st_is_longlat(x)))
            warning("st_voronoi does not correctly triangulate longitude/latitude data")
        st_sfc(CPL_geos_voronoi(x, st_sfc(envelope), dTolerance = dTolerance,
            bOnlyEdges = as.integer(bOnlyEdges)))
    }
    else stop("for voronoi, GEOS version 3.5.0 or higher is required")
}
In other words, if the GEOS version is less than 3.5.0, the operation fails completely. If it is >= 3.5.0 (sf:::CPL_geos_version() reports that I have version 3.8.1), and long-lat data is being used, a warning is supposed to be issued (but the computation is done anyway).
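As an aside, you can check which GEOS version your own sf build links against without reaching for the internal helper:
sf::sf_extSoftVersion()["GEOS"]
# or, as quoted above, via the internal call:
sf:::CPL_geos_version()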
The first time I ran this I didn't get the warning; I checked and options("warn") was set to -1 (suppressing warnings). I'm not sure why; running from a clean session did give me the warning. Maybe something in the pipeline (e.g. rnaturalearth telling me I needed to install the rnaturalearthdata package) accidentally set the option?
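If you hit the same thing, a quick way to check and restore the default warning behaviour is:
getOption("warn")   # -1 means warnings are currently being suppressed
options(warn = 0)   # restore the default (warnings are stored and printed)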

Related

sf: How can I get all the points that lie within a polygon? [duplicate]

I have a shapefile (with several polygons) and a data frame with coordinates. I want to assign each coordinate in the data frame to a polygon in the shapefile, i.e. add a column to the data frame with the polygon name or id.
Here is the link to the data
library(sf)
library(readr)
shape <- read_sf("data/Provinces_v1_2017.shp")
data<- read_csv("data/data.csv")
But when I try to join them, I always get an error:
pts = st_as_sf(data, coords = c("dec_lon", "dec_lat"), crs= 4326)
st_join(pts, shape)
I tried the over() function and other tricks like st_make_valid(), but I always get this error:
Error in s2_geography_from_wkb(x, oriented = oriented, check = check) : Evaluation error: Found 30 features with invalid spherical geometry.
It is a recent issue (my code worked before), but now I am unable to use the sf package for this task; I always end up with this error. I updated the libraries to see whether that would help, but I could not make it work.
I would really appreciate your help on this matter
You have two options:
turn off the s2 processing via sf::sf_use_s2(FALSE) in your script; in theory the behaviour should revert to the one before release 1.0
repair the spherical geometry of your polygons object; this will depend on the actual nature of your errors.
I can't access your file & make certain, but this piece of code has helped me in the past:
yer_object$geometry <- yer_object$geometry %>%
  s2::s2_rebuild() %>%
  sf::st_as_sfc()
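Applied to the objects in your question, an untested sketch of the two options might look like this (assuming the rebuild is enough to repair the 30 invalid features):
# option 1: fall back to planar (GEOS) processing
# sf::sf_use_s2(FALSE)
# option 2: repair the spherical geometry, then join
shape$geometry <- sf::st_as_sfc(s2::s2_rebuild(shape$geometry))
joined <- sf::st_join(pts, shape)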
I find that this 'invalid spherical geometry' error does keep popping up. If the s2::s2_rebuild() solution above doesn't work, an approach that usually works for me involves projecting and simplifying (reducing the map resolution a little). If your application can work with less resolution, try this.
library(tidyverse)
library(sf)
crs_N <- 3995  # northern polar projection
# example of FAILING map - with bad spherical geometry.
m_RU <- rnaturalearthdata::countries50 %>%
  st_as_sf() %>%
  filter(admin %in% c("Russia")) %>%
  st_as_s2()
In the example, I chose Russia because it crosses the dateline, which can be one of the challenges. I switch to an Arctic polar projection and reduce the map to 10 km resolution (5 km is not enough in this case!).
# with 2 extra lines the problem is gone
m_RU <- rnaturalearthdata::countries50 %>%
  st_as_sf() %>%
  filter(admin %in% c("Russia")) %>%
  st_transform(crs = crs_N) %>%
  st_simplify(dTolerance = 10000) %>%  # to get rid of duplicate vertices (reduce to 10 km steps)
  st_as_s2()
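A quick, untested sanity check of the result: transform back to lon/lat and confirm that, with s2 active, sf now considers the geometry valid on the sphere (m_RU_sf is just an illustrative name):
m_RU_sf <- rnaturalearthdata::countries50 %>%
  st_as_sf() %>%
  filter(admin %in% c("Russia")) %>%
  st_transform(crs = crs_N) %>%
  st_simplify(dTolerance = 10000) %>%
  st_transform(crs = 4326)
all(st_is_valid(m_RU_sf))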

How to add location points onto a shapefile imported into R from QGIS using ggplot2

I'm working on a project which involves GPS coordinates from offshore locations. I'm looking to measure the distance from shore for each of my points. I have created a shapefile of the shoreline in question in QGIS and I have successfully imported it into R using the st_read() function (named "biminishore" in this example).
With the following code, I'm able to plot my shapefile in ggplot2:
bplot = ggplot() +
  geom_sf(data = biminishore, size = 0.1, color = "black", fill = "green1") +
  ggtitle("Bimini, The Bahamas") +
  coord_sf() +
  theme_classic()
plot(bplot)
Now, I would like to add the location coordinates (imported into R as a .csv with separate columns for Lat and Lon) as a layer over the imported shapefile. Can anyone suggest how to go about doing this in a way that will allow me to calculate the distance between each point and the nearest shoreline point?
My current attempts give the error: Error in st_transform.sfc(st_geometry(x), crs, ...) : cannot transform sfc object with missing crs
I assume this means my coordinate systems are incompatible, but I haven't found a way around it yet. So far, I have tried combining my point columns using SpatialPoints(). I've also tried various forms of st_set_crs() and st_transform(), but no luck yet. Any help is greatly appreciated! Thanks!
Read your points file as a csv & then transform it to an sf object:
library(tidyverse)
library(sf)
points <- read_csv('path_to_points.csv')
#make it an sf object, change Long and Lat to the correct column name
points_sf <- st_as_sf(points, coords = c("Long", "Lat"))
# set crs of points_sf to same as biminishore object
points_sf <- st_set_crs(points_sf, st_crs(biminishore))
Then you should be able to plot them together by adding:
+ geom_sf(data = points_sf)
to your ggplot2 call.
Finding the nearest feature between the two can be done with sf::st_nearest_feature(points_sf, biminishore).
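For the distance-to-shore part of the question, a minimal untested sketch (the dist_to_shore column name is just for illustration):
nearest <- st_nearest_feature(points_sf, biminishore)
# units follow the CRS of the layers (metres for lon/lat data under s2)
points_sf$dist_to_shore <- st_distance(points_sf, biminishore[nearest, ], by_element = TRUE)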
A good post on nearest features & distances: https://gis.stackexchange.com/questions/349955/getting-a-new-column-with-distance-to-the-nearest-feature-in-r

How to determine if a point lies within an sf geometry that spans the dateline?

Using the R package sf, I'm trying to determine whether some points occur within the bounds of a shapefile (in this case, Hawai‘i's EEZ). The shapefile in question can be found here. Unfortunately, the boundaries of the area in question span +/-180 longitude, which I think is what's messing me up. (I read on the sf website some business about spherical geometry in the new version, but I haven't been able to get that version to install. I think the polygons I'm dealing with are sufficiently "flat" to avoid any of those issues anyway.) Part of the issue seems to be that my shapefile contains multiple geometries broken up by the dateline, but I'm not sure how to combine them.
How do you tell, using sf, whether some points are inside of the bounds of some object in a shapefile (that happens to span the dateline)?
I have tried various combinations of st_shift_longitude to no avail. I have also tried transforming to what I think is a planar projection (2163), and that didn't work.
Here's how I'm currently trying to do this:
library(sf)
library(maps)
library(ggplot2)
library(tidyverse)
# this is the shapefile from the link above
eez_unshifted <- read_sf("USMaritimeLimitsAndBoundariesSHP/USMaritimeLimitsNBoundaries.shp") %>%
  filter(OBJECTID == 1206) %>%
  st_transform(4326)
eez_shifted <- read_sf("USMaritimeLimitsAndBoundariesSHP/USMaritimeLimitsNBoundaries.shp") %>%
  filter(OBJECTID == 1206) %>%
  st_transform(4326) %>%
  st_shift_longitude()
# four points, in and out of the geometry, on either side of the dateline
pnts <- tibble(x = c(-171.952474, 176.251978, 179.006220, -167.922929),
               y = c(25.561970, 17.442716, 28.463375, 15.991429)) %>%
  st_as_sf(coords = c('x', 'y'), crs = st_crs(eez_unshifted))
# these all return false for every point
st_within(pnts,eez_unshifted)
st_within(st_shift_longitude(pnts),eez_unshifted)
st_within(pnts,eez_shifted)
st_within(st_shift_longitude(pnts),eez_shifted)
# these also all return false for every point
st_intersects(pnts,eez_unshifted)
st_intersects(st_shift_longitude(pnts),eez_unshifted)
st_intersects(pnts,eez_shifted)
st_intersects(st_shift_longitude(pnts),eez_shifted)
# plot the data just to show that it looks right
wrld2 <- st_as_sf(maps::map('world2', plot=F, fill=T))
ggplot() +
  geom_sf(data = wrld2, fill = 'gray20', color = "lightgrey", size = 0.07) +
  geom_sf(data = eez_shifted) +
  geom_sf(data = st_shift_longitude(pnts)) +
  coord_sf(xlim = c(100, 290), ylim = c(-60, 60)) +
  xlab("Longitude") +
  ylab("Latitude")
The answer is to make sure the geometry you're checking against is a polygon:
> eez_poly <- st_polygonize(eez_shifted)
> st_within(pnts,eez_poly)
although coordinates are longitude/latitude, st_within assumes that they are planar
Sparse geometry binary predicate list of length 4, where the predicate was `within'
1: 1
2: (empty)
3: 1
4: (empty)
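To turn that sparse predicate into one TRUE/FALSE per point, a small follow-up (matching the output above, where points 1 and 3 fall inside):
inside <- lengths(st_within(pnts, eez_poly)) > 0
inside
# TRUE FALSE TRUE FALSE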

Project longitude and latitude to a planar coordinate system

Disclaimer: I am currently using the ppp branch version of the {sf} package, because new features for converting objects between {sf} and {spatstat} are available in it (see https://github.com/r-spatial/sf/issues/1233). For it to work properly, I had to manually delete the {sf} package from my hard drive and then reinstall it from Github. I am also using the development version of {spatstat} for no particular reason.
# install.packages("remotes")
remotes::install_github("r-spatial/sf@ppp")
remotes::install_github("spatstat/spatstat")
I have two geospatial objects, both of the sf family: area_one, the union of the polygons of several counties in Texas, and vz, the point locations of several stores in Texas. vz was created using longitude and latitude coordinates scraped from the internet. My goal is to create a ppp object with the locations in vz as the points and the polygon in area_one as the window. The issue is that I cannot find the correct coordinate reference system (CRS) for my points to lie inside the polygon; I get a warning telling me that the points lie outside the window. Here are the two files to make the code below reproducible:
area_one: download here
vz: download here
# Load packages
library(sf) # Development version in the ppp branch
library(spatstat) # Development version in the master branch
library(tmap)
library(here)
# Read the geospatial data (CRS = 4326)
area_one <- st_read(dsn = here("area_one/area_one.shp"), layer = "area_one")
vz <- st_read(dsn = here("vz/vz.shp"), layer = "vz")
# Plot a quick map
tm_shape(area_one) +
  tm_borders() +
  tm_shape(vz) +
  tm_dots(col = "red", size = 0.1)
# Create a planar point pattern
vz_lonlat <- st_coordinates(vz)
area_one_flat <- st_transform(area_one, crs = 6345)
# Note that the reason this line does not throw an error (despite the window
# argument being an sf object) is the version of the {sf} package I am using
p <- ppp(x = vz_lonlat[, "X"], y = vz_lonlat[, "Y"], window = as.owin(area_one_flat))
Warning message:
49 points were rejected as lying outside the specified window
plot(p)
As @spacedman points out, you should first transform vz to the same coordinate system as the observation region. I guess you could do something like (untested):
vz_flat <- st_coordinates(st_transform(vz, crs = 6345))
area_one_flat <- st_transform(area_one, crs = 6345)
p <- ppp(x = vz_flat[, "X"], y = vz_flat[, "Y"], window = as.owin(area_one_flat))
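If any points still get rejected after the transform, a quick untested check of which ones fall outside the window:
w <- as.owin(area_one_flat)
ok <- inside.owin(vz_flat[, "X"], vz_flat[, "Y"], w)
table(ok)    # how many points fall inside / outside
vz[!ok, ]    # the offending stores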

Incorrect NA return when converting Lat/Long Coordinates to location in R

I am trying to use a modified version of the R code found in the following link:
Latitude Longitude Coordinates to State Code in R
To test the code, I created the following formal arguments:
mapping = "state"
pointsDF = data.frame(x = c(-88.04607, -83.03579), y = c(42.06907, 42.32983))
latlong2state(pointsDF, mapping)
The code returned the following:
[1] "Illinois" NA
The first coordinate set returns a correct answer, i.e. "Illinois". However, when I input the 2nd coordinate set (i.e. -83.03579, 42.32983) into an online converter, I get the following:
Downtown, Detroit, MI, USA
(http://www.latlong.net/Show-Latitude-Longitude.html)
Running the code again but changing the second coordinate from 42.32983 to 43.33 puts the point in the state of Michigan.
When using the "world" map as the formal argument for the "mapping" variable, the code returns "USA". I have been struggling for days to figure this out and have had no luck. I have played around with SpatialPointsDataFrames, various projections, and looked into the state polygon objects themselves. I am using R version 3.3.1 on a Windows 7 system. I think the data point in question may be falling on a border line, in which case I think an NA would be expected. The code I used is below.
Code Used:
library(sp)
library(maps)
library(maptools)
library(rgdal)
latlong2state = function(pointsDF, mapping) {
  local.map = map(database = mapping, fill = TRUE, col = "transparent", plot = FALSE)
  IDs = sapply(strsplit(local.map$names, ":"), function(x) x[1])
  maps_sp = map2SpatialPolygons(map = local.map, ID = IDs,
                                proj4string = CRS("+proj=longlat +datum=WGS84"))
  pointsSP = SpatialPoints(pointsDF,
                           proj4string = CRS("+proj=longlat +datum=WGS84"))
  indices = over(x = pointsSP, y = maps_sp)
  mapNames = sapply(maps_sp@polygons, function(x) {x@ID})
  mapNames[indices]
}
I am only two months in to learning R and love the language thus far. This has been the first time I could not find an answer. I would really appreciate an help provided on the matter!!!
Firstly, the issue is not due to the point lying on a border. In fact, over() would not return NA for a point on a border, but rather "if a point falls in multiple polygons, the last polygon is recorded."
NA denotes a point that does not fall in a polygon. We can zoom in on your map to see this is the case
# re-create the map object outside the function so we can plot it
local.map <- map(database = mapping, fill = TRUE, col = "transparent", plot = FALSE)
plot(local.map, xlim = c(-83.2, -82.8), ylim = c(42.2, 42.6), type = "l")
polygon(local.map, col = "grey60")
points(local.map)
points(pointsDF[2, ], col = "red")
The point falls outside the contiguous USA, in Canada, according to the polygons provided by maps::map(). Why would this be the case when other maps, as you say, locate this point on the USA side of the border? I do not think this is a projection issue, because we are using the same WGS84 geographic coordinates for the polygons and the points. It seems, therefore, that the polygons provided by maps::map() may themselves be inaccurate.
We can check this by comparing with polygons from another source. I downloaded the US Census Bureau's highest-resolution state boundaries from http://www2.census.gov/geo/tiger/GENZ2015/shp/cb_2015_us_state_500k.zip. Then,
shp.path <- "C:/Users/xxx/Downloads/cb_2015_us_state_500k/cb_2015_us_state_500k.shp"
states <- readOGR(path.expand(shp.path), "cb_2015_us_state_500k")
plot(states, xlim = c(-83.2, -82.8), ylim=c(42.2,42.6))
points(pointsDF[2,], col="red")
gets us this map in which we see that the point is inside the US boundary:
The solution I recommend, therefore, is to use these higher-resolution, more reliable boundary polygons, particularly if you want to accurately resolve points close to borders.
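To slot the census polygons into the original workflow, an untested sketch (reusing states and pointsDF from above, and assuming the NAME column holds the state name in the census shapefile):
pointsSP <- SpatialPoints(pointsDF, proj4string = CRS("+proj=longlat +datum=WGS84"))
states_ll <- spTransform(states, CRS("+proj=longlat +datum=WGS84"))
res <- over(pointsSP, states_ll)   # one row of polygon attributes per point
res$NAME                           # expected: "Illinois" "Michigan"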
