Subset class sfc_LINESTRING & sfc objects within a bbox - r

Example:
bbox <- c(-0.1178, 51.4232, -0.0185, 51.5147) # I know it needs to be sf df object
# we have
df
#> Geometry set for 300 features
#> geometry type: LINESTRING
#> dimension: XY
#> bbox: xmin: -0.113894 ymin: 51.49739 xmax: -0.0764779 ymax: 51.59839
#> epsg (SRID): 4326
#> proj4string: +proj=longlat +datum=WGS84 +no_defs
#> LINESTRING (-0.113894 51.50631, -0.1135137 51.5...
#> LINESTRING (-0.0767875 51.59837, -0.0764779 51....
#> ....
How can I do something like
df[bbox]
and keep only the linestrings that fall within the bbox? Thanks.

Here's an example using an sf object from tigris, just for reproducibility. I'm using towns in New Haven County, Connecticut, plotting it the way it comes in. Then I crop it to a bounding box I made up, using st_crop, which I believe was added fairly recently to sf. If I had the bbox as a shape, instead of a vector of coordinates, I could have used st_intersection.
I don't have a linestring object handy, but I'd assume it works the same way.
library(tidyverse)
library(sf)
# selecting just to limit the amount of data in my sf
ct_sf <- tigris::county_subdivisions(state = "09", county = "09", cb = T, class = "sf") %>%
select(NAME, geometry)
plot(ct_sf)
crop_bbox <- c(xmin = -73, ymin = 41.2, xmax = -72.7, ymax = 41.5)
ct_cropped <- st_crop(ct_sf, crop_bbox)
plot(ct_cropped)
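For linestrings specifically, a minimal sketch might look like the following. The two lines are made up for illustration (they are not the asker's data), and the bbox values are the ones from the question. It shows two possible readings of "within the bbox": keeping only features that lie entirely inside the box (st_within), or clipping every feature to the box (st_crop).

```r
library(sf)

# Two made-up lines: one fully inside the box, one crossing its northern edge
inside   <- st_linestring(rbind(c(-0.10, 51.45), c(-0.05, 51.50)))
crossing <- st_linestring(rbind(c(-0.08, 51.50), c(-0.08, 51.60)))
lines <- st_sfc(inside, crossing, crs = 4326)

# The bbox from the question, turned into a bbox object and a polygon
bbox <- st_bbox(c(xmin = -0.1178, ymin = 51.4232, xmax = -0.0185, ymax = 51.5147),
                crs = st_crs(4326))
box <- st_as_sfc(bbox)

# Option 1: keep only the features that lie completely within the box
within_idx <- lengths(st_within(lines, box)) > 0
lines_within <- lines[within_idx]

# Option 2: clip every feature to the box (partial lines are cut at the edge)
lines_cropped <- st_crop(lines, bbox)
```

Which option fits depends on whether partially overlapping lines should be dropped or trimmed.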

Related

Transforming from EPSG:4326 to EPSG:3857 massively inflates longitude and latitude numbers

I've downloaded a world map and want to change it from the default CRS (EPSG:4326) to the WGS 84 / Pseudo-Mercator projection used in applications like Tableau and Google Maps (EPSG:3857). For some reason, when I attempt the transformation, the longitude and latitude numbers become inflated.
For example, (-95.160, -95.102) becomes (-10593226.108, -10586797.584)
library(rnaturalearth)
library(dplyr)
library(sf)
target_crs <- st_crs(3857)
# Use the United States as an example
US <- ne_countries(scale = 10, type = "countries", returnclass = "sf") %>%
filter(admin == "United States of America") %>%
select(admin, geometry)
head(US)
#Simple feature collection with 1 feature and 1 field
#Geometry type: MULTIPOLYGON
#Dimension: XY
#Bounding box: xmin: -179.1435 ymin: 18.90612 xmax: 179.7809 ymax: 71.4125
#CRS: +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
# admin geometry
#1 United States of America MULTIPOLYGON (((-95.16057 4...
# Now attempt to transform the CRS
US <- st_transform(US, target_crs)
head(US)
#Simple feature collection with 1 feature and 1 field
#Geometry type: MULTIPOLYGON
#Dimension: XY
#Bounding box: xmin: -19942160 ymin: 2143886 xmax: 20013120 ymax: 11544810
#Projected CRS: WGS 84 / Pseudo-Mercator
# admin geometry
#1 United States of America MULTIPOLYGON (((-10593226 6...
Notice the massive change in the xmin, ymin, xmax, ymax values. I'm not sure what's causing this.
EPSG:3857 is in metres and EPSG:4326 is in degrees, so the numbers are expected to differ. Are you close to Peter Island?
You can test your coords here if needed : https://epsg.io/map#srs=3857&x=-10093891.770199&y=-10709420.049722&z=12&layer=streets
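To see where the large numbers come from, here is a small sketch. The point is made up (the longitude is the one from the question, the latitude is arbitrary). In Web Mercator the easting is the longitude in radians multiplied by the WGS 84 semi-major axis (6378137 m), so the sf result can be checked by hand.

```r
library(sf)

# A made-up point at the longitude from the question (latitude chosen arbitrarily)
pt_deg <- st_sfc(st_point(c(-95.16057, 45)), crs = 4326)

# Reproject to Web Mercator: units change from degrees to metres
pt_m <- st_transform(pt_deg, 3857)
x_sf <- st_coordinates(pt_m)[1, "X"]

# Same easting by hand: x = R * longitude in radians
R <- 6378137
x_manual <- R * (-95.16057) * pi / 180

# Transforming back to EPSG:4326 recovers the original degrees
pt_back <- st_transform(pt_m, 4326)
lon_back <- st_coordinates(pt_back)[1, "X"]
```

So the values are not inflated; they are the same locations expressed in metres instead of degrees.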

How to build Voronoi Polygons for point coordinates in R?

I have various points (2000+) for observation stations in the Alps. I would like to use each of them to represent the geographic area that is closer to it than to any other observation station. I have done some research, and think that using Voronoi polygons may be the best way to do this.
After attempting to build these in R, the polygon plot does not quite match my map of the points.
I have attached the sample data points I am experimenting with, as well as the two images that show the dissimilar graphing of the points.
What do I need to do differently to make sure that the points line up?
Points:
Longitude:
15.976667 12.846389 14.457222 13.795556 9.849167 16.055278 13.950833 15.666111 9.654722 15.596389 13.226667 15.106667 13.760000 12.226111 9.612222 17.025278 9.877500 15.368056 13.423056 12.571111 16.842222 13.711667 14.003056 12.308056 13.536389
Latitude:
48.40167 48.14889 47.56778 46.72750 47.45833 48.04472 47.82389 47.49472 47.35917 48.64917 48.25000 48.87139 47.87444 47.42806 47.20833 47.77556 47.40389 47.87583 47.53750 46.77694 47.74250 46.55000 48.37611 47.38333 47.91833
Pictures:
Map of the 25 sample points in Leaflet:
Voronoi plot:
Clearly these two are not the same images, so I must be doing something wrong. Here's the code I'm using to generate the Voronoi plot and the leaflet map.
meta25%>%
st_as_sf(coords = c("Longitude", "Latitude"),
crs = sp::CRS("+proj=longlat +datum=WGS84")) %>%
mapview()
m1 = matrix(meta25$Longitude,meta25$Latitude,ncol=2,nrow=25) %>% st_multipoint()
voronoi_grid <- st_voronoi(m1)
plot(voronoi_grid, col = NA)
plot(m1, add = TRUE, col = "blue", pch = 16)
I'm not sure what the problem is, but the matrix is not necessary. Stick to sf objects and you should be fine.
library(tidyverse)
library(sf)
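# create pts from lat & lon data (x = longitudes, y = latitudes from the question)
# sf expects coords in x, y order, i.e. longitude first
pts <- tibble(longitude = x, latitude = y) %>%
st_as_sf(coords = c('longitude', 'latitude')) %>%
st_set_crs(4326)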
# voronoi of pts
vor <- st_voronoi(st_combine(pts))
head(vor)
#> Geometry set for 1 feature
#> Geometry type: GEOMETRYCOLLECTION
#> Dimension: XY
#> Bounding box: xmin: 2.199166 ymin: 39.13694 xmax: 24.43833 ymax: 56.28445
#> Geodetic CRS: WGS 84
#> GEOMETRYCOLLECTION (POLYGON ((2.199166 49.37841...
# st_voronoi returns a GEOMETRYCOLLECTION,
# some plotting methods can't use a GEOMETRYCOLLECTION.
# this returns polygons instead
vor_poly <- st_collection_extract(vor)
head(vor_poly)
#> Geometry set for 6 features
#> Geometry type: POLYGON
#> Dimension: XY
#> Bounding box: xmin: 2.199166 ymin: 39.13694 xmax: 18.32787 ymax: 56.28445
#> Geodetic CRS: WGS 84
#> First 5 geometries:
#> POLYGON ((2.199166 49.37841, 2.199166 56.28445,...
#> POLYGON ((9.946349 39.13694, 2.199166 39.13694,...
#> POLYGON ((18.32787 39.13694, 11.64381 39.13694,...
#> POLYGON ((9.794868 47.23828, 9.766296 47.38061,...
#> POLYGON ((5.225657 56.28445, 9.393793 56.28445,...
plot(pts, col = 'blue', pch = 16)
plot(vor_poly, add = T, fill = NA)
Created on 2021-04-05 by the reprex package (v0.3.0)
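A possible explanation for the mismatch in the question, offered as a sketch rather than a confirmed diagnosis: in matrix(meta25$Longitude, meta25$Latitude, ncol=2, nrow=25), nrow and ncol are named, so the second unnamed argument is matched to byrow rather than to the data. The latitudes therefore never enter the matrix. If a matrix is wanted anyway, cbind() is one way to build it. The example below uses the first five points from the question and leaves the CRS unset to keep it minimal.

```r
library(sf)

# First five sample points from the question
lon <- c(15.976667, 12.846389, 14.457222, 13.795556, 9.849167)
lat <- c(48.40167, 48.14889, 47.56778, 46.72750, 47.45833)

# cbind() builds a two-column matrix: column 1 = x (longitude), column 2 = y (latitude)
coords <- cbind(lon, lat)
m1 <- st_sfc(st_multipoint(coords))

# Voronoi cells come back as a GEOMETRYCOLLECTION; extract the polygons
vor <- st_voronoi(m1)
cells <- st_collection_extract(vor, "POLYGON")

plot(cells, col = NA)
plot(m1, add = TRUE, col = "blue", pch = 16)
```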
Thanks everyone for your help, not sure if it got quite to where I was looking for. I've since adapted the answer from here: Creating bordering polygons from spatial point data for plotting in leaflet

Create new geometry on grouped column in R sf

I'd like to create a new shapefile or a new geometry variable that allows me to plot borders around regions in R. I'm using sf and mapping with tmap. Basically, I'm adding a character vector to an sf object and would like to use that character vector to define the borders that get mapped.
Here is an example of my approach, which doesn't do what I'd like. I can't tell that it does anything.
library(tidyverse)
library(sf)
library(tmap)
## use North Carolina example
nc = st_read(system.file("shape/nc.shp", package="sf"))
nc_new.region <- nc %>% ## add new region variable
mutate(new.region = sample(c('A', 'B', 'C'), nrow(.),replace = T))
nc_union <- nc_new.region %>%
group_by(new.region) %>% # group by the new character vector
mutate(new_geometry = st_union(geometry)) # union on the geometry variable
# map with tmap package
tm_shape(nc_union)+
tm_borders()
This happens because mutate(new_geometry = st_union(geometry)) creates a "new" column within the original sf object, but plotting still uses the "original" geometry column. Indeed, if you have a look at your nc_union object, you'll see that it still contains 100 features (therefore, no "dissolving" was really done).
To do what you wish, you should instead create a "new" sf object using summarize over the groups:
library(tidyverse)
library(sf)
library(tmap)
## use North Carolina example
nc = st_read(system.file("shape/nc.shp", package="sf"))
#> Reading layer `nc' from data source `D:\Documents\R\win-library\3.5\sf\shape\nc.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 100 features and 14 fields
#> geometry type: MULTIPOLYGON
#> dimension: XY
#> bbox: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
#> epsg (SRID): 4267
#> proj4string: +proj=longlat +datum=NAD27 +no_defs
nc_new.region <- nc %>% ## add new region variable
mutate(new.region = sample(c('A', 'B', 'C'), nrow(.),replace = T))
nc_union <- nc_new.region %>%
group_by(new.region) %>%
summarize()
nc_union
#> Simple feature collection with 3 features and 1 field
#> geometry type: MULTIPOLYGON
#> dimension: XY
#> bbox: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
#> epsg (SRID): 4267
#> proj4string: +proj=longlat +datum=NAD27 +no_defs
#> # A tibble: 3 x 2
#> new.region geometry
#> <chr> <MULTIPOLYGON [°]>
#> 1 A (((-78.65572 33.94867, -79.0745 34.30457, -79.04095 34.3193, -79.02947 34.34737, -7~
#> 2 B (((-79.45597 34.63409, -79.6675 34.80066, -79.68596 34.80526, -79.66015 34.8179, -7~
#> 3 C (((-78.65572 33.94867, -78.63472 33.97798, -78.63027 34.0102, -78.58778 34.03061, -~
tm_shape(nc_union)+
tm_borders()
You can see that nc_union now contains only 3 MULTIPOLYGONs, and the plot reflects the "aggregation".
See also: https://github.com/r-spatial/sf/issues/290
Created on 2019-08-23 by the reprex package (v0.3.0)
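As a small variation on the same idea, summarise() can also carry attribute summaries along with the dissolved geometry. This is only a sketch: the seed and the column names n_counties and total_area are made up for illustration, and AREA is one of the fields that ships with the nc shapefile.

```r
library(sf)
library(dplyr)

nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

set.seed(42)  # make the random region labels reproducible
nc_region <- nc %>%
  mutate(new.region = sample(c("A", "B", "C"), n(), replace = TRUE))

# summarise() on a grouped sf object unions the geometries within each group,
# so the result has one row (and one dissolved geometry) per region
nc_dissolved <- nc_region %>%
  group_by(new.region) %>%
  summarise(n_counties = n(), total_area = sum(AREA))

nrow(nc_region)     # one row per county
nrow(nc_dissolved)  # one row per region
```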

R, sf: intersect lines with the borders of multipolygons, extract coordinates of those intersections

I am a newbie to sf and Stack, so I hope my question is clear enough.
I was able to create a set of lines connecting 1 point to a set of points all over the US.
Then I can read the US counties into multipolygons.
My goal is to find and geolocate all the points where the lines I created cross the county borders.
So far I was able to create the lines from the points:
points_to_lines <- dt %>%
st_as_sf(coords = c("lon", "lat"), crs = 4326) %>%
group_by(lineid) %>%
summarize(do_union = FALSE) %>%
st_cast("LINESTRING")
This is the head of the lines
Simple feature collection with 1628 features and 1 field
geometry type: LINESTRING
dimension: XY
bbox: xmin: 30.1127 ymin: -91.32484 xmax: 37.23671 ymax: -82.31262
epsg (SRID): 4326
proj4string: +proj=longlat +datum=WGS84 +no_defs
# A tibble: 1,628 x 2
lineid geometry
<int> <LINESTRING [°]>
1 1 (33.51859 -86.81036, 36.16266 -86.7816)
2 2 (33.51859 -86.81036, 34.61845 -82.47791)
This is the head of the county dataset.
Reading layer `US_county_1930_conflated' from data source `~/county_gis/1930' using driver `ESRI Shapefile'
Simple feature collection with 3110 features and 18 fields
geometry type: MULTIPOLYGON
dimension: XY
bbox: xmin: -7115608 ymin: -1337505 xmax: 2258244 ymax: 4591848
epsg (SRID): NA
proj4string: +proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=37.5 +lon_0=-96 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs
Very naively, I tried to give them both the same coordinate reference system and then ran st_intersects on them. The non-sparse matrix seems to say that all the lines intersect only one county.
gis1930_p <- st_set_crs(gis1930, 4326) %>% st_transform(4326)
st_intersects(points, gis1930_p, sparse=FALSE)
I also plotted the lines on top of the counties, but only the US counties are drawn.
plot(gis1930_p[0], reset = FALSE)
plot(points[0], add = TRUE)
Any help would be greatly appreciated and please let me know if I can provide any additional details.
You didn't provide your data so I am going to use the dataset provided in: https://gis.stackexchange.com/a/236922/32531
The main thing you need is the st_intersection function:
library(sf)
line_1 <- st_as_sfc("LINESTRING(458.1 768.23, 455.3 413.29, 522.3 325.77, 664.8 282.01, 726.3 121.56)")
poly_1 <- st_as_sfc("MULTIPOLYGON(((402.2 893.03, 664.8 800.65, 611.7 666.13, 368.7 623.99, 215.1 692.06, 402.2 893.03)), ((703.9 366.29, 821.2 244.73, 796.1 25.93, 500.0 137.76, 703.9 366.29)))")
pnts <- st_intersection(line_1,
st_cast(poly_1, "MULTILINESTRING", group_or_split = FALSE))
plot(poly_1)
plot(line_1, add = TRUE)
plot(pnts, add = TRUE, col = "red", pch = 21)
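One more thing that may matter for the original data: st_set_crs only relabels the coordinates, it does not reproject them, so the counties would stay in Albers metres while claiming to be in degrees. The sketch below illustrates the difference with a single made-up point and the proj4string from the question. Separately, the bbox printed for the lines (xmin around 30, ymin around -91) suggests the lon and lat columns may hold swapped values, which would be worth checking too.

```r
library(sf)

# Albers equal-area definition copied from the county layer in the question
aea <- "+proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=37.5 +lon_0=-96 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs"

# A made-up point in lon/lat (x = longitude first, y = latitude second)
pt_lonlat <- st_sfc(st_point(c(-86.81036, 33.51859)), crs = 4326)

# st_transform() reprojects: the coordinates become metres in the Albers CRS
pt_aea <- st_transform(pt_lonlat, st_crs(aea))

# st_set_crs() only swaps the label (sf warns about this); the numbers stay in metres
relabelled <- suppressWarnings(st_set_crs(pt_aea, 4326))

# Reprojecting properly brings the point back to degrees
pt_back <- st_transform(pt_aea, 4326)

st_coordinates(relabelled)  # still Albers metres, now mislabelled as degrees
st_coordinates(pt_back)     # degrees again
```

With both layers in the same CRS via st_transform, st_intersection should have matching coordinates to work with.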

How to create zipcode boundaries in R

I am trying to create a map that shows the boundaries of multiple zip codes, labelled with the name of the 'community' they belong to. The data I have is similar to the example below, where each variable is the name of a community and the values are the corresponding zip codes.
Tooele <- c('84074','84029')
NEUtahCo <- c('84003', '84004', '84042', '84062')
NWUtahCounty <- c('84005','84013','84043','84045')
I was able to make a map of the entire area I want using
ggmap(get_map(location = c(lon=-111.9, lat= 40.7), zoom = 9))
Attached is a picture of what I want.
You have a decent foundation for this already by having figured out the shapefile and how it matches the zips you want to show. Simple features (sf) make this pretty easy, as does the brand new ggplot2 v3.0.0 which has the geom_sf to plot sf objects.
I wasn't sure if the names of the different areas (counties?) that you have are important, so I just threw them all into little tibbles and bound that into one tibble, utah_zips. tigris also added sf support, so if you set class = "sf", you get an sf object. To keep it simple, I'm just pulling out the columns you need and simplifying one of the names.
library(tidyverse)
library(tigris)
library(ggmap)
Tooele <- c('84074','84029')
NEUtahCo <- c('84003', '84004', '84042', '84062')
NWUtahCounty <- c('84005','84013','84043','84045')
utah_zips <- bind_rows(
tibble(area = "Tooele", zip = Tooele),
tibble(area = "NEUtahCo", zip = NEUtahCo),
tibble(area = "NWUtahCounty", zip = NWUtahCounty)
)
zips_sf <- zctas(cb = T, starts_with = "84", class = "sf") %>%
select(zip = ZCTA5CE10, geometry)
head(zips_sf)
#> Simple feature collection with 6 features and 1 field
#> geometry type: MULTIPOLYGON
#> dimension: XY
#> bbox: xmin: -114.0504 ymin: 37.60461 xmax: -109.0485 ymax: 41.79228
#> epsg (SRID): 4269
#> proj4string: +proj=longlat +datum=NAD83 +no_defs
#> zip geometry
#> 37 84023 MULTIPOLYGON (((-109.5799 4...
#> 270 84631 MULTIPOLYGON (((-112.5315 3...
#> 271 84334 MULTIPOLYGON (((-112.1608 4...
#> 272 84714 MULTIPOLYGON (((-113.93 37....
#> 705 84728 MULTIPOLYGON (((-114.0495 3...
#> 706 84083 MULTIPOLYGON (((-114.0437 4...
Then you can filter the sf for just the zips you need—since there's other information (the county names), you can use a join to get everything in one sf data frame:
utah_sf <- zips_sf %>%
inner_join(utah_zips, by = "zip")
head(utah_sf)
#> Simple feature collection with 6 features and 2 fields
#> geometry type: MULTIPOLYGON
#> dimension: XY
#> bbox: xmin: -113.1234 ymin: 40.21758 xmax: -111.5677 ymax: 40.87196
#> epsg (SRID): 4269
#> proj4string: +proj=longlat +datum=NAD83 +no_defs
#> zip area geometry
#> 1 84029 Tooele MULTIPOLYGON (((-112.6292 4...
#> 2 84003 NEUtahCo MULTIPOLYGON (((-111.8497 4...
#> 3 84074 Tooele MULTIPOLYGON (((-112.4191 4...
#> 4 84004 NEUtahCo MULTIPOLYGON (((-111.8223 4...
#> 5 84062 NEUtahCo MULTIPOLYGON (((-111.7734 4...
#> 6 84013 NWUtahCounty MULTIPOLYGON (((-112.1564 4...
You already have your basemap figured out, and since ggmap makes ggplot objects, you can just add on a geom_sf layer. The tricks are just to make sure you declare the data you're using, set it to not inherit the aes from ggmap, and turn off the graticules in coord_sf.
basemap <- get_map(location = c(lon=-111.9, lat= 40.7), zoom = 9)
ggmap(basemap) +
geom_sf(aes(fill = zip), data = utah_sf, inherit.aes = F, size = 0, alpha = 0.6) +
coord_sf(ndiscr = F) +
theme(legend.position = "none")
You might want to adjust the position of the basemap, since it cuts off one of the zips. One way is to use st_bbox to get the bounding box of utah_sf, then use that to get the basemap.
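A rough sketch of that last step, under some assumptions: utah_like is a stand-in rectangle built from the bbox printed above (not the real utah_sf), pad is an arbitrary margin in degrees, and the get_map call is left commented out because it needs network access and, depending on the tile source, an API key.

```r
library(sf)

# Stand-in for utah_sf: a rectangle with the bbox shown in the output above
utah_bbox <- st_bbox(c(xmin = -113.1234, ymin = 40.21758,
                       xmax = -111.5677, ymax = 40.87196),
                     crs = st_crs(4269))
utah_like <- st_as_sfc(utah_bbox)

# Bounding box of the data, padded a little so no zip is cut off
bb <- st_bbox(utah_like)
pad <- 0.1  # arbitrary margin in degrees
map_bbox <- c(left   = bb[["xmin"]] - pad,
              bottom = bb[["ymin"]] - pad,
              right  = bb[["xmax"]] + pad,
              top    = bb[["ymax"]] + pad)

# get_map() accepts a left/bottom/right/top vector as location
# basemap <- ggmap::get_map(location = map_bbox)
```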
