I'm working with a dataframe containing longitude and latitude for each point. I have a shapefile containing mutually exclusive polygons. I would like to find the index of the polygon in which each point is contained. Is there a specific function that helps me achieve this? I've been trying with the sf package, but I'm open to doing it with another one. Any help is greatly appreciated.
I believe you are looking for the function sf::st_intersects(). With sparse = TRUE it returns a list which, in this use case (points and a set of non-overlapping polygons), can easily be converted to a vector.
Consider this example, built on the North Carolina shapefile shipped with {sf}
library(sf)
# a shapefile included with the sf package
shape <- st_read(system.file("shape/nc.shp", package="sf")) %>%
  st_transform(4326) # WGS84 is a good default
# three semi-random cities
cities <- data.frame(name = c("Raleigh", "Greensboro", "Wilmington"),
                     x = c(-78.633333, -79.819444, -77.912222),
                     y = c(35.766667, 36.08, 34.223333)) %>%
  st_as_sf(coords = c("x", "y"), crs = 4326) # again, WGS84
# plot cities on full map
plot(st_geometry(shape))
plot(cities, add = TRUE, pch = 4)
# this is your index
index_of_intersection <- st_intersects(cities, shape, sparse = TRUE) %>%
  as.numeric()
# plot on subsetted map to double-check
plot(st_geometry(shape[index_of_intersection, ]))
plot(cities, add = TRUE, pch = 4)
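One caveat, as far as I can tell: the as.numeric() conversion relies on every point matching exactly one polygon. If a point falls outside all polygons, its list element is empty and the conversion fails. A minimal base R sketch of a more forgiving conversion follows; the helper name first_hit_or_na is hypothetical (not part of sf), and the plain list below is only a stand-in for the st_intersects() output, with made-up index values.

```r
# Hypothetical helper (not part of sf): first matching polygon index per point,
# or NA when the point falls outside every polygon
first_hit_or_na <- function(hits) {
  vapply(hits,
         function(i) if (length(i) == 0) NA_integer_ else i[[1]],
         integer(1))
}

# Stand-in for st_intersects(cities, shape): one integer vector of polygon
# indices per point; here the second point matches no polygon (values made up)
hits <- list(37L, integer(0), 65L)

as_index <- first_hit_or_na(hits)
as_index  # 37 NA 65
```

Since the sparse result of st_intersects() behaves like a list of integer vectors, the same helper should be applicable to it directly.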
Related
I am trying to find out the geographical area that is equidistant from two points, and to plot this as an ellipse.
I can produce plots for one point easily using st_buffer, and can find numerous R functions that will plot an ellipse from a known centroid if I define the axes, but have not been able to find one that will plot an ellipse given two known foci and a defined distance.
The similar question here gets some way towards an answer, but is not readily applicable to geographic situations - Draw an ellipse based on its foci
My code is pretty simple at the moment and gives each coordinate a 100 km radius. However, I would like to find all the positions that would be reachable on a 200 km (or other defined distance) trip between both sites.
library(tidyverse)
library(sf)
#Give Coordinates
citylocations <- tibble::tribble(
~city, ~lon, ~lat,
"London", -0.1276, 51.5072,
"Birmingham", -1.8904, 52.4862,
)
citydflocations <- as.data.frame(citylocations)
#Convert to SF
citysflocations <- sf::st_as_sf(citydflocations, coords = c("lon","lat" ), crs = 4326)
#Convert location file to National Grid Planar
cityBNGsflocations <- citysflocations %>%
  st_transform(crs = 27700)
#Produce circles with 100km buffer
dat_circles <- st_buffer(cityBNGsflocations, dist = 100000)
join_circles <- st_union(dat_circles) %>%
st_transform(4326)
plot(join_circles, col = 'lightblue')
The function below creates buffers of varying distances around each of the two points it is given, finds the intersection of each pair of buffers, unions the intersections, and finally returns the convex hull of that union. The output should be a close approximation of an ellipse with the two points as foci.
The straight-line(s) distance from one city to any edge of the polygon and then to the other city should equal the distance given in the function (200,000m in the example below).
It works on the data provided, but is fragile as there's no error checking or warning suppression. Make sure the dist argument is greater than the distance between the two points, and that the points have a crs that can use meters as a distance. (lat/lon might not work)
The example below only uses 20 points for the 'ellipse', but changing the function should be relatively straightforward.
library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1; sf_use_s2() is TRUE
library(tidyverse)
#Give Coordinates
citylocations <- tibble::tribble(
~city, ~lon, ~lat,
"London", -0.1276, 51.5072,
"Birmingham", -1.8904, 52.4862,
)
citydflocations <- as.data.frame(citylocations)
#Convert to SF
citysflocations <- sf::st_as_sf(citydflocations, coords = c("lon","lat" ), crs = 4326)
#Convert location file to National Grid Planar
cityBNGsflocations <- citysflocations %>%
  st_transform(crs = 27700)
#Produce circles with 100km buffer
dat_circles <- st_buffer(cityBNGsflocations, dist = 100000)
join_circles <- st_union(dat_circles) %>%
st_transform(4326)
#plot(join_circles, col = 'lightblue')
### the ellipse function using 20 buffers ####
ellipse_fn <- function(x_sf, y_sf, distance){
  # set distance argument to meters, get sequence of distances for buffers
  distance <- units::set_units(distance, 'm')
  dists_1 <- seq(units::set_units(0, 'm'), distance, length.out = 22)
  # create empty sf object to place for loop objects in
  # purrr would probably be better here
  nrows <- 20
  df <- st_sf(city = rep(NA, nrows), city.1 = rep(NA, nrows), geometry = st_sfc(lapply(1:nrows, function(x) st_geometrycollection())))
  # use the function arguments rather than the global object;
  # steps 1 and 22 are skipped because one of the two buffers has zero radius
  for(i in 2:21){
    buff_1 <- st_buffer(x_sf, dist = dists_1[i])
    buff_2 <- st_buffer(y_sf, dist = distance - dists_1[i])
    intersection <- st_intersection(buff_1, buff_2)
    df[i-1,] <- intersection
  }
  df %>%
    st_set_crs(st_crs(x_sf)) %>%
    st_union() %>%
    st_convex_hull()
}
### end ellipse function ###
# Using the ellipse function with 2 points & 200000m distance
ellipse_sf <- ellipse_fn(cityBNGsflocations[1,], cityBNGsflocations[2,], distance = 200000)
# You'll get lots of warnings here about attributes being assumed spatially constant...
ggplot() +
geom_sf(data = ellipse_sf, fill = 'black', alpha = .2) +
geom_sf(data = cityBNGsflocations, color = 'red')
Created on 2022-06-03 by the reprex package (v2.0.1)
mapview plot of the cities & 'ellipse' on a map:
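As an alternative to approximating the shape with buffers, the ellipse can also be computed directly from its definition: with foci f1 and f2 and a total trip distance d, the semi-major axis is a = d/2, half the focal separation is c = |f2 - f1|/2, and the semi-minor axis is b = sqrt(a^2 - c^2). The base R sketch below is only an illustration under assumptions: the function name ellipse_from_foci is hypothetical, the coordinates are made-up planar values in metres (not the real British National Grid positions of the cities), and it treats the plane as flat.

```r
# Hypothetical helper: ring of points on the ellipse with foci f1 and f2
# (planar x/y in metres) whose distances to the two foci sum to total_dist
ellipse_from_foci <- function(f1, f2, total_dist, n = 100) {
  centre <- (f1 + f2) / 2
  half_focal <- sqrt(sum((f2 - f1)^2)) / 2   # half the distance between foci
  a <- total_dist / 2                        # semi-major axis
  stopifnot(a > half_focal)                  # otherwise no ellipse exists
  b <- sqrt(a^2 - half_focal^2)              # semi-minor axis
  theta <- atan2(f2[2] - f1[2], f2[1] - f1[1])  # rotation of the major axis
  t <- seq(0, 2 * pi, length.out = n + 1)
  x <- centre[1] + a * cos(t) * cos(theta) - b * sin(t) * sin(theta)
  y <- centre[2] + a * cos(t) * sin(theta) + b * sin(t) * cos(theta)
  ring <- cbind(x = x, y = y)
  ring[n + 1, ] <- ring[1, ]                 # close the ring exactly
  ring
}

# Made-up foci in a metre-based planar CRS, 200 km total distance
f1 <- c(0, 0)
f2 <- c(100000, 50000)
ring <- ellipse_from_foci(f1, f2, total_dist = 200000)
```

With sf loaded, st_sfc(st_polygon(list(ring)), crs = 27700) should turn such a ring into a polygon, and the foci could be taken from st_coordinates(cityBNGsflocations).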
Let's say I have the below data along with the code. The code returns point data but I want a polygon.
How can I do a spatial join such that it returns a polygon with both the point and polygon attributes? (Basically, the data will be matched/joined based on the points that fall within the polygon.)
Code + Data
library(sf)
library(tidyverse)
# Sample poly
poly = st_read(system.file("shape/nc.shp", package="sf")) # included with sf package
# Sample points
pts = data.frame(name = c("Raleigh", "Greensboro", "Wilmington"),
x = c(-78.633333, -79.819444, -77.912222),
y = c(35.766667, 36.08, 34.223333)) %>%
st_as_sf(coords = c("x", "y"), crs = 4326) %>%
st_transform(st_crs(poly))
# Spatial join and output a polygon with the joined attributes, stuck here....
cities_with_counties = st_join(pts,
poly)
The geometry type returned by sf::st_join() is driven by the function's first argument.
Consider flipping the two - st_join(poly, pts).
The difference in output should be only in geometry type (and ordering of columns).
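One thing to keep in mind is that st_join() performs a left join by default, so with the polygons first every county is kept and the point attributes are NA where no point falls inside. A minimal sketch of the flipped join, assuming only the matched polygons are wanted (left = FALSE); the object name counties_with_cities is my own choice:

```r
library(sf)

# Sample polygons shipped with the sf package
poly <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Sample points, transformed to the CRS of the polygons
pts <- data.frame(name = c("Raleigh", "Greensboro", "Wilmington"),
                  x = c(-78.633333, -79.819444, -77.912222),
                  y = c(35.766667, 36.08, 34.223333))
pts <- st_as_sf(pts, coords = c("x", "y"), crs = 4326)
pts <- st_transform(pts, st_crs(poly))

# Polygons first, so the result keeps polygon geometry;
# left = FALSE drops counties that contain none of the points
counties_with_cities <- st_join(poly, pts, left = FALSE)
```

Dropping left = FALSE should return all counties instead, which may be preferable when the full map is still needed.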
I would like to create evenly spaced north-south polylines, each 10 miles long, with 50 miles between lines. Not sure if this is possible using the sf package. In the example below, I would like to have the lines filling the counties across the state of Washington.
library(tigris)
library(leaflet)
library(dplyr) # provides filter()
states <- states(cb = TRUE)
counties <- counties(cb = TRUE)
counties <- counties %>% filter(STATEFP == "53")
states <- states %>% filter(NAME == "Washington")
leaflet(states) %>%
addProviderTiles("CartoDB.Positron") %>%
addPolygons(fillColor = "white",
color = "black",
weight = 0.5) %>%
addPolygons(data=counties,color='red',fillColor = 'white')%>%
setView(-120.5, 47.3, zoom=8)
I've updated to include an image of what I'd like to do below.
You can create a multilinestring sf object from scratch by specifying coordinates.
You can get these coordinates from the extent (bounding box) of Washington, but you may also be interested in knowing how to create a grid, which I will demonstrate below because it may be helpful.
Copy and paste this reproducible example:
library(tidyverse)
library(tigris)
library(leaflet)
library(sf)
library(raster)
states <- states(cb = TRUE)
# subset for WA and transform to a meter-based CRS
states <- states %>%
filter(NAME == "Washington") %>%
st_transform(crs = 3857) # Mercator
# fifty miles in meters
fm <- 80467.2
# convert to an sp object for use with the raster package
states_sp <- as(states, "Spatial")
# create a grid, convert it to polygons to plot
grid <- raster(extent(states_sp),
resolution = c(fm, fm),
crs = proj4string(states_sp))
grid <- rasterToPolygons(grid)
plot(states_sp)
plot(grid, add = TRUE)
# find the top y coordinate and calculate 50 mile intervals moving south
ty <- extent(grid)[4] # y coordinate along northern WA edge
ty <- ty - (fm * 0:7) # y coordinates moving south at 50 mile intervals
# create a list of sf linestring objects
l <- vector("list", length(ty))
for(i in seq_along(l)){
l[[i]] <-
st_linestring(
rbind(
c(extent(grid)[1], ty[i]),
c(extent(grid)[2], ty[i])
)
)
}
# create the multilinestring, which expects a list of linestrings
ml <- st_multilinestring(l)
plot(states_sp)
plot(as(ml, "Spatial"), add = TRUE, col = "red")
As you can see, I switch back and forth between sf and sp objects using the functions as(sf_object, "Spatial") and st_as_sf(sp_object). Use these to transform the data to your needs.
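The lines above span the full width of the bounding box. If short north-south segments are what is needed, one possible sf-only sketch is to lay out start points on a 50-mile grid and draw a 10-mile vertical segment from each. The bounding box below is a made-up stand-in for st_bbox(states), and the variable names are my own; note also that distances in EPSG:3857 are stretched at Washington's latitude, so a local projected CRS would probably give truer mileages.

```r
library(sf)

mile    <- 1609.344      # metres per mile
spacing <- 50 * mile     # distance between segment start points
seg_len <- 10 * mile     # length of each north-south segment

# Hypothetical bounding box in a metre-based CRS (stand-in for st_bbox(states))
bb <- c(xmin = 0, ymin = 0, xmax = 400000, ymax = 300000)

# Start points on a regular grid, leaving room for the segment at the top edge
xs <- seq(bb[["xmin"]], bb[["xmax"]], by = spacing)
ys <- seq(bb[["ymin"]], bb[["ymax"]] - seg_len, by = spacing)
starts <- expand.grid(x = xs, y = ys)

# One vertical linestring per start point
segments <- lapply(seq_len(nrow(starts)), function(i) {
  st_linestring(rbind(c(starts$x[i], starts$y[i]),
                      c(starts$x[i], starts$y[i] + seg_len)))
})
lines_sfc <- st_sfc(segments, crs = 3857)
```

The segments could then be clipped to the state outline with st_intersection() before plotting.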
I have a dataframe of points on map and an area of interest described as a polygon of points. I want to calculate the distance between each of the points to the polygon, ideally using the sf package.
library("tidyverse")
library("sf")
# area of interest
area <-
"POLYGON ((121863.900623145 486546.136633659, 121830.369032584 486624.24942906, 121742.202408334 486680.476675484, 121626.493982203 486692.384434804, 121415.359596921 486693.816446951, 121116.219703244 486773.748535465, 120965.69439283 486674.642759986, 121168.798757601 486495.217550029, 121542.879304342 486414.780364836, 121870.487595417 486512.71203006, 121863.900623145 486546.136633659))"
# convert to sf and project on a projected coord system
area <- st_as_sfc(area, crs = 7415L)
# points with long/lat coords
pnts <-
data.frame(
id = 1:3,
long = c(4.85558, 4.89904, 4.91073),
lat = c(52.39707, 52.36612, 52.36255)
)
# convert to sf with the same crs
pnts_sf <- st_as_sf(pnts, crs = 7415L, coords = c("long", "lat"))
# check if crs are equal
all.equal(st_crs(pnts_sf),st_crs(area))
I am wondering why the following approaches do not give me the correct answer.
1. Simply using st_distance: doesn't work, wrong answer
st_distance(pnts_sf, area)
2. In a mutate call: all wrong answers
pnts_sf %>%
mutate(
distance = st_distance(area, by_element = TRUE),
distance2 = st_distance(area, by_element = FALSE),
distance3 = st_distance(geometry, area, by_element = TRUE)
)
However this approach seems to work and gives correct distances.
3. Mapping over the long/lat: works correctly
pnts_geoms <-
map2(
pnts$long,
pnts$lat,
~ st_sfc(st_point(c(.x, .y)) , crs = 4326L)
) %>%
map(st_transform, crs = 7415L)
map_dbl(pnts_geoms, st_distance, y = area)
I'm new to spatial data and I'm trying to learn the sf package, so I'm wondering what is going wrong here. As far as I can tell, the first two approaches somehow end up considering the points "as a whole" (one of the points is inside the area polygon, so I guess that's why one of the wrong answers is 0). The third approach considers one point at a time, which is my intention.
Any ideas how I can get the mutate call to work as well?
I'm on R 3.4.1 with
> packageVersion("dplyr")
[1] ‘0.7.3’
> packageVersion("sf")
[1] ‘0.5.5’
So it turns out that the whole confusion was caused by a small silly oversight on my part. Here's the breakdown:
The points dataframe comes from a different source (!) than the area polygon.
Overlooking this, I kept trying to set them to crs 7415, which is a legal but incorrect move and eventually led to the wrong answers.
The right approach is to convert them to sf objects in the crs they originate from, transform them to the one the area object is in and then proceed to compute the distances.
Putting it all together:
# this part was wrong, crs was supposed to be the one they were
# originally coded in
pnts_sf <- st_as_sf(pnts, crs = 4326L, coords = c("long", "lat"))
# then apply the transformation to another crs
pnts_sf <- st_transform(pnts_sf, crs = 7415L)
st_distance(pnts_sf, area)
--------------------------
Units: m
[,1]
[1,] 3998.5701
[2,] 0.0000
[3,] 751.8097
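A rough way to catch this kind of mix-up early is to look at the coordinate ranges before assigning a CRS: values within [-180, 180] and [-90, 90] suggest degrees, while six-digit values suggest a projected system in metres. The base R sketch below is only a heuristic, and the helper name looks_like_degrees is hypothetical:

```r
# Points from the question, stored as long/lat
pnts <- data.frame(
  id = 1:3,
  long = c(4.85558, 4.89904, 4.91073),
  lat = c(52.39707, 52.36612, 52.36255)
)

# Hypothetical heuristic: TRUE when all values fit the range of degrees
looks_like_degrees <- function(x, y) {
  all(abs(x) <= 180) && all(abs(y) <= 90)
}

looks_like_degrees(pnts$long, pnts$lat)                 # TRUE
# first vertex of the area polygon, clearly not degrees
looks_like_degrees(121863.900623145, 486546.136633659)  # FALSE
```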
The sf package provides a great approach to working with geographic features, but I can't figure out a simple equivalent to the poly.counts function from the GISTools package, which requires sp objects.
poly.counts computes the number of points from a SpatialPointsDataFrame that fall within the polygons of a SpatialPolygonsDataFrame and can be used as follows:
Data
## Libraries
library("GISTools")
library("tidyverse")
library("sf")
library("sp")
library("rgdal")
## Obtain shapefiles
download.file(url = "https://www2.census.gov/geo/tiger/TIGER2016/STATE/tl_2016_us_state.zip", destfile = "data-raw/states.zip")
unzip(zipfile = "data-raw/states.zip", exdir = "data-raw/states")
sf_us_states <- read_sf("data-raw/states")
## Our observations:
observations_tibble <- tribble(
~lat, ~long,
31.968599, -99.901813,
35.263266, -80.854385,
35.149534, -90.04898,
41.897547, -84.037166,
34.596759, -86.965563,
42.652579, -73.756232,
43.670406, -93.575858
)
Calculate points per polygon
I generate both my sp objects:
sp_us_states <- as(sf_us_states, "Spatial")
observations_spdf <- observations_tibble %>%
select(long, lat) %>% # SPDF want long, lat pairs
SpatialPointsDataFrame(coords = .,
data = .,
proj4string = sp_us_states@proj4string)
Now I can use poly.counts
points_in_states <-
poly.counts(pts = observations_spdf, polys = sp_us_states)
Add this into the sp object:
sp_us_states$points.in.state <- points_in_states
Now I've finished I'd convert back to sf objects and could visualise as follows:
library("leaflet")
updated_sf <- st_as_sf(sp_us_states)
updated_sf %>%
filter(points.in.state > 0) %>%
leaflet() %>%
addPolygons() %>%
addCircleMarkers(
data = observations_tibble
)
Question
Can I perform this operation without tedious conversion between sf and sp objects?
Try the following:
sf_obs = st_as_sf(observations_tibble, coords = c("long", "lat"),
crs = st_crs(sf_us_states))
lengths(st_covers(sf_us_states, sf_obs))
# check:
summary(points_in_states - lengths(st_covers(sf_us_states, sf_obs)))
st_covers returns a list with the indexes of the points covered by each state; lengths returns a vector of the lengths of these index vectors, i.e. the point count per state. The warnings you'll see indicate that although you have geographic coordinates, the underlying software assumes they are Cartesian (which, in this case, is most likely not a problem; move to projected coordinates if you want to get rid of the warnings the proper way).
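To round off the workflow without leaving sf, the counts can be stored as a column and used for filtering. The sketch below is only an illustration: it swaps in the North Carolina counties shipped with sf and three cities instead of the downloaded states file, so that it runs without a download, and the column name points_in_county is my own.

```r
library(sf)

# County polygons shipped with the sf package (stand-in for the states file)
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Three observations, converted to sf and transformed to the polygons' CRS
obs <- data.frame(long = c(-78.633333, -79.819444, -77.912222),
                  lat  = c(35.766667, 36.08, 34.223333))
obs_sf <- st_as_sf(obs, coords = c("long", "lat"), crs = 4326)
obs_sf <- st_transform(obs_sf, st_crs(nc))

# Point count per polygon, stored directly on the sf object
nc$points_in_county <- lengths(st_covers(nc, obs_sf))

# Keep only the polygons that contain at least one observation
nc_hit <- nc[nc$points_in_county > 0, ]
```

The filtered object stays an sf data frame, so it can be passed on to leaflet() and addPolygons() as in the question.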