In some analyses, it makes sense to use border length as a measure of cultural distance between countries, the idea being that countries that share larger proportions of their borders are more culturally close. This then raises the question of how to compute this. We can grab a shapefile of the world from naturalearthdata.com which covers some 251 units (i.e. they are not all sovereign).
I looked over the methods in the Geocomputation with R ebook website and it seems like an intersection is closest to what we want, i.e. st_intersection(), while st_touches() finds the neighbors without giving any sense of the border length. However, when I try it out on two neighbors, Denmark and Germany, I get no overlap:
> suppressWarnings(library(sf))
Linking to GEOS 3.6.2, GDAL 2.2.3, PROJ 4.9.3; sf_use_s2() is TRUE
> world = read_sf("data/ne_10m_admin_0_countries/ne_10m_admin_0_countries.shp")
> #make valid otherwise get an error
> world = st_make_valid(world)
> #which countries touch each other by border (of the polygons)
> neighbor_ids = st_touches(
+ world$geometry,
+ world$geometry
+ )
> #Denmark Germany
> (germany_idx = which(world$ADMIN=="Germany"))
[1] 50
> (denmark_idx = which(world$ADMIN=="Denmark"))
[1] 71
> world$ADMIN[neighbor_ids[germany_idx][[1]]]
[1] "France" "Czechia" "Luxembourg" "Belgium" "Denmark" "Poland" "Austria" "Switzerland" "Netherlands"
> world$ADMIN[neighbor_ids[denmark_idx][[1]]]
[1] "Germany"
> #border intersection
> #Denmark Germany border as test
> st_intersection(
+ world$geometry[germany_idx],
+ world$geometry[denmark_idx]
+ )
Geometry set for 0 features
Bounding box: xmin: NA ymin: NA xmax: NA ymax: NA
CRS: 4326
How does one get the border lengths? According to Wikipedia, it should be 68 km.
It seems that what is needed is to tell st_intersection() to include the line at the border. By default, this 1 point overlap is ignored, I guess because it has a 0 area. This functionality is controlled by the ... which forwards to s2_options(). The right parameter is model, which defaults to "open", but should be "closed". Thus:
> #include the line
> st_intersection(
+ world$geometry[germany_idx],
+ world$geometry[denmark_idx],
+ model = "closed"
+ )
Geometry set for 1 feature
Geometry type: MULTILINESTRING
Dimension: XY
Bounding box: xmin: 8.660776 ymin: 54.80162 xmax: 9.437503 ymax: 54.9059
CRS: 4326
MULTILINESTRING ((9.436922 54.81014, 9.422143 5...
To get the length, just add on st_length():
> st_intersection(
+ world$geometry[germany_idx],
+ world$geometry[denmark_idx],
+ model = "closed"
+ ) %>% st_length()
111563 [m]
Only the result is wrong! It is too large by a factor of about 1.64.
Potential problems:
Is this an issue with the coastline paradox?
Some kind of incorrect setting? As far as I can tell, the only setting for st_distance() is the size of the earth, which seems to be set correctly.
Bad shapefile? I downloaded a different one (I forgot the source), and it produced a result of 113469 m, which is slightly different but not remotely close to 68000 as Wikipedia gives.
Is it due to a water border? I plotted the border with tmap, and it looks fine.
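A rough sanity check on the expected order of magnitude, using base R only: the great-circle distance between opposite corners of the bounding box printed by st_intersection() above comes out at roughly 51 km. A border that spans that box cannot be much shorter than this, so 68 km is plausible while 111 km would require a very wiggly line. The function name haversine_m and the mean Earth radius are assumptions of this sketch, not anything taken from {sf}.

```r
# Great-circle (haversine) distance in metres on a sphere.
# ASSUMPTION: a mean Earth radius of 6371008.8 m; {sf}/{s2} may use a slightly different value.
haversine_m <- function(lon1, lat1, lon2, lat2, radius = 6371008.8) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(sqrt(a))
}

# Opposite corners of the bounding box of the Denmark/Germany intersection
bbox_diagonal <- haversine_m(8.660776, 54.80162, 9.437503, 54.9059)
bbox_diagonal
```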
This is an interesting problem; I believe the coastline paradox plays a role, but only a minor one. The chief issue seems to be driven by CRS.
Let me illustrate on three examples using the world dataset provided by GISCO (i.e. Eurostat). I like this dataset as it allows several levels of precision.
a rough map in EPSG:3035 (the official CRS for continental EU)
a fine map in EPSG:3035
the same fine map in EPSG:4326 / WGS84
Compare these with the official length, i.e. the 68 kilometers given by Wikipedia.
The rough map is off by about a fifth (roughly 53 km against 68 km), which is to be expected given the low resolution. The fine map is quite close (about 6% short), and you could expect the measured length to increase further with more detail, as 1:1M is still a coarse map.
On the other hand, the length of the same fine map as in the previous example, but in WGS84 (unprojected longitude/latitude), is off by a factor of nearly two, as you observed.
library(sf)
library(dplyr)
library(giscoR)
# rough line / resolution 1:60 000 000
rough <- gisco_get_countries(resolution = "60",
epsg = 3035,
country = c("DE", "DK"))
plot(st_geometry(rough))
sf::st_intersection(rough[1, ], rough[2, ], model = "closed") %>%
mutate(border_length = st_length(.)) %>%
pull(border_length)
# 53141.03 [m]
# fine line / resolution 1:1 000 000
fine <- gisco_get_countries(resolution = "01",
epsg = 3035,
country = c("DE", "DK"))
plot(st_geometry(fine))
sf::st_intersection(fine[1, ], fine[2, ], model = "closed") %>%
mutate(border_length = st_length(.)) %>%
pull(border_length)
# 63795.4 [m]
# fine line in WGS84
fine_wgs <- gisco_get_countries(resolution = "01",
epsg = 4326,
country = c("DE", "DK"))
sf::st_intersection(fine_wgs[1, ], fine_wgs[2, ], model = "closed") %>%
mutate(border_length = st_length(.)) %>%
pull(border_length)
# 127223 [m]
EDIT (2022-09-12): on second thought, this seems to be affected by the behavior of the S2 engine behind {sf} (turning it off via sf_use_s2(FALSE) leads to a more reasonable border length even for data in WGS84).
I will raise it as an issue with the {sf} maintainers, as this does not seem to be the expected behaviour.
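To see what a shared border should look like when the planar GEOS engine does the work, here is a minimal sketch on toy data: two unit squares without a CRS, so S2 is never involved. The helper square() is made up for this illustration.

```r
library(sf)

# Toy data: two unit squares sharing the edge x = 1.
# No CRS is set, so {sf} treats the coordinates as planar and uses GEOS.
square <- function(xmin) {
  st_polygon(list(rbind(
    c(xmin, 0), c(xmin + 1, 0), c(xmin + 1, 1), c(xmin, 1), c(xmin, 0)
  )))
}
west <- st_sfc(square(0))
east <- st_sfc(square(1))

# The intersection of two touching polygons is their shared edge
shared_border <- st_intersection(west, east)
st_geometry_type(shared_border)

# Length of the shared edge, in the (unitless) coordinates of the toy data
border_length <- st_length(shared_border)
border_length
```

A check of this kind on synthetic geometry with a known answer is a cheap way to tell an engine or option problem apart from a data problem.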
Related
I am having some issues with st_intersection. I am trying to intersect polygons from bus stop buffers in preparation for areal interpolation. Here is the data: https://realtime.commuterpage.com/rtt/public/utility/gtfs.aspx
Here is my code:
ART2019Path <- file.path(GTFS_path, "2019-10_Arlington.zip")
ART2019GTFS <- read_gtfs(ART2019Path)
ART2019StopLoc <- stops_as_sf(ART2019GTFS$stops) ### Make a spatial file for stops
ART2019Buffer <- st_buffer(ART2019StopLoc, dist = 121.92) ### Make buffer with 400ft (121.92m) radius
It creates something that looks like the image below (created using mapview); as you can see, there are multiple overlapping buffers.
I tried intersecting the polygons using the following:
BufferIntersect <- st_intersection(ART2019Buffer, ART2019Buffer)
BufferIntersect <- st_make_valid(BufferIntersect) ### Fix some of the polygons that didn't quite work
But it only intersects two layers of polygons, meaning there is still overlap. How do I make all buffers intersect?
I have looked at similar questions on here like this: Loop to check multiple polygons overlap in r SF package
But there is no answer.
One of the comments suggested the following links:
https://r-spatial.org/r/2017/12/21/geoms.html
https://r-spatial.github.io/sf/reference/geos_binary_ops.html#details-1
But I can't get either to work. Any help would be greatly appreciated.
Edit
Couple of clarifying points in response to some comments.
I am interested in the area of each unique polygon within bus stop buffers as I will be using these polygons in an areal interpolation with census data to estimate population with access to bus stops
400ft walking distance is standard practice for bus stop accessibility
It sounds like you just want the buffer(s), but without having to deal with all of the overlapping sections. It doesn't matter if a person is within 400ft of one bus-stop or three, right?
If so, you can use the st_union function to "blend" the buffers together.
library(tidytransit)
library(sf)
library(mapview)
library(ggplot2)
# s2 true allows buffering in meters, s2 off later speeds things up
sf::sf_use_s2(TRUE)
ART2019Path <- file.path("/your/file/path/")
ART2019GTFS <- read_gtfs(ART2019Path)
ART2019StopLoc <- stops_as_sf(ART2019GTFS$stops) ### Make a spatial file for stops
ART2019Buffer <- st_buffer(ART2019StopLoc, dist = 121.92) ### Make buffer with 400ft (121.92m) radius
# might be needed due to some strange geometries in buffer, and increase speed
sf::sf_use_s2(FALSE)
#> Spherical geometry (s2) switched off
# MULTIPOLYGON sfc object covering only the buffered areas,
# there are no 'overlaps'.
buff_union <- st_union(st_geometry(ART2019Buffer))
#> although coordinates are longitude/latitude, st_union assumes that they are planar
buff_union
#> Geometry set for 1 feature
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: -77.16368 ymin: 38.83828 xmax: -77.04768 ymax: 38.9263
#> Geodetic CRS: WGS 84
#> MULTIPOLYGON (((-77.08604 38.83897, -77.08604 3...
# Non-overlapping buffer & stops
ggplot() +
geom_sf(data = buff_union, fill = 'blue', alpha = .4) +
geom_sf(data = ART2019StopLoc, color = 'black') +
coord_sf(xlim = c(-77.09, -77.07),
ylim = c(38.885, 38.9))
# Overlapping buffer & stops
ggplot() +
geom_sf(data = ART2019Buffer, fill = 'blue', alpha = .4) +
geom_sf(data = ART2019StopLoc, color = 'black') +
coord_sf(xlim = c(-77.09, -77.07),
ylim = c(38.885, 38.9))
# Back to original settings
sf::sf_use_s2(TRUE)
#> Spherical geometry (s2) switched on
Created on 2022-04-18 by the reprex package (v2.0.1)
Something like this works for me, albeit a bit slowly. Here I loop through each stop buffer and run the intersection process on an object containing all other stop buffers excluding that stop buffer.
library(sf)
library(tidyverse)
df<-read.csv("YOUR_PATH/google_transit/stops.txt")
# Read data
ART2019StopLoc <- st_as_sf(df, coords=c('stop_lon', 'stop_lat'))
ART2019StopLoc <- st_set_crs(ART2019StopLoc, value=4326)
# Make buffer
ART2019Buffer <- st_buffer(ART2019StopLoc, dist=121.92)
# Create empty data frame to store results
results <- data.frame()
# Loop through each stop and intersect with other stops
for(i in 1:nrow(ART2019Buffer)) {
# Subset to stop of interest
stop <- ART2019Buffer[i,]
# Subset to all other stops excl. stop of interest
stop_check <- ART2019Buffer[-i,]
# Intersect and make valid
stop_intersect <- st_intersection(stop, stop_check) %>%
st_make_valid()
# Create one intersected polygon
stop_intersect <- st_combine(stop_intersect) %>%
st_as_sf() %>%
mutate(stop_name=stop$stop_name)
# Combine into one results object
results <- rbind(results, stop_intersect)
print(i)
}
ggplot() +
geom_sf(data=ART2019Buffer %>% filter(stop_name %in% results$stop_name),
fill='gray70') +
geom_sf(data=results, aes(fill=stop_name), alpha=0.5)
The plot below shows the results for the first 8 stops. The gray circles are the original stop buffers and the colored buffers show the intersection with adjacent buffers.
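If the goal is the area of each unique piece of the overlapping buffers, another option is to call st_intersection() with a single argument, which is what the geos_binary_ops documentation linked in the question describes: the layer is split into non-overlapping pieces, and each piece carries the columns n.overlaps and origins. The sketch below uses three made-up buffers in planar coordinates (no CRS), not the Arlington data.

```r
library(sf)

# Toy data: three overlapping circular buffers, planar coordinates, arbitrary units
centres <- st_sf(
  stop_name = c("A", "B", "C"),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 0)), st_point(c(0.5, 0.8)))
)
buffers <- st_buffer(centres, dist = 1)

# With only one argument, st_intersection() returns non-overlapping pieces;
# n.overlaps counts how many of the original buffers cover each piece
pieces <- st_intersection(buffers)
pieces$area <- as.numeric(st_area(pieces))

st_drop_geometry(pieces[, c("stop_name", "n.overlaps", "area")])
```

Because the pieces do not overlap, their areas add up to the area of the union of the buffers, which is the property areal interpolation needs.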
I am struggling with gCentroid, because it doesn't seem -- to me -- to give the 'right' answer near a pole of the Earth.
For instance:
library(rgeos)
gCentroid(SpatialPoints(coords=data.frame(longitude=c(-135,-45,45,135),latitude=c(80,80,80,80)),proj4string = CRS('EPSG:4326')))
does not give me the North Pole, it gives:
> SpatialPoints:
> x y
> 1 0 80
> Coordinate Reference System (CRS) arguments: +proj=longlat +datum=WGS84 +no_defs
How do I get gCentroid to work on the surface of the Earth?
The GEOS library is limited to planar geometry operations; this can bring issues in edge cases / the poles being a notorious example.
For the centroid via GEOS to work as intended you need to transform your coordinates from WGS84 to a coordinate reference system appropriate to polar regions; for Arctic regions I suggest EPSG:3995.
library(sp)
library(dplyr)
library(rgeos)
points_sp <- SpatialPoints(coords=data.frame(longitude=c(-135,-45,45,135),latitude=c(80,80,80,80)),proj4string = CRS('EPSG:4326'))
points_updated <- points_sp %>%
spTransform(CRS("EPSG:3995")) # a projected CRS appropriate for Arctic regions
centroid <- gCentroid(points_updated) %>%
spTransform(CRS("EPSG:4326")) # back to safety of WGS84!
centroid # looks better now...
# SpatialPoints:
# x y
# 1 0 90
# Coordinate Reference System (CRS) arguments: +proj=longlat +datum=WGS84 +no_defs
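For intuition about why a planar mean of longitude and latitude fails near a pole, here is a small base R sketch that averages the points as 3D unit vectors on a sphere and converts the mean vector back to longitude/latitude. The function name spherical_centroid is made up for this illustration, and the Earth is treated as a perfect sphere.

```r
# Centroid of points on a unit sphere: average the 3D unit vectors,
# then convert the mean vector back to longitude/latitude in degrees.
# ASSUMPTION: spherical Earth; this is an illustration, not what GEOS or s2 do internally.
spherical_centroid <- function(lon, lat) {
  to_rad <- pi / 180
  x <- mean(cos(lat * to_rad) * cos(lon * to_rad))
  y <- mean(cos(lat * to_rad) * sin(lon * to_rad))
  z <- mean(sin(lat * to_rad))
  c(lon = atan2(y, x) / to_rad,
    lat = atan2(z, sqrt(x^2 + y^2)) / to_rad)
}

centre <- spherical_centroid(lon = c(-135, -45, 45, 135), lat = c(80, 80, 80, 80))
centre["lat"]  # the latitude of the centroid; the longitude is meaningless at a pole
```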
Also note that your workflow, while not wrong in principle, is a bit dated, and the {rgeos} package is approaching its end of life.
It may be a good time to give strong consideration to the {sf} package, which is newer, actively developed and can, via its interface to the s2 library from Google, handle spherical geometry operations.
For an example of {sf} based workflow consider this code; the result (centroid = North Pole) is equivalent to the sp / rgeos one.
library(sf)
points_sf <- points_sp %>% # declared earlier
st_as_sf()
centroid_sf <- points_sf %>%
st_union() %>% # unite features / from 4 points >> 1 multipoint
st_centroid()
centroid_sf # the North Pole in a slightly different (sf vs sp) format
# Geometry set for 1 feature
# Geometry type: POINT
# Dimension: XY
# Bounding box: xmin: 0 ymin: 90 xmax: 0 ymax: 90
# Geodetic CRS: WGS 84 (with axis order normalized for visualization)
# POINT (0 90)
I am cleaning my dataset and I don't know how to clean GPS data.
When I use the table function, I find that the values were entered in different formats:
"547140",
"35.6997",
"251825.7959",
"251470.43",
"54/4077070001",
and "54/305495"
I don't know how to clean this variable when the values differ this much.
I would be thankful if you could help me or suggest a website where I can learn this.
Your main issue is standardizing the GPS data by transforming it to a coordinate system of choice. Say we have the coordinates of Amsterdam in two different coordinate systems, one in Amersfoort / RD New (EPSG 28992) and one in WGS 1984 (EPSG 4326):
x y location espg
1: 1.207330e+05 486632.35593 amsterdam 28992
2: 4.884088e+00 52.36651 amsterdam 4326
dt <- structure(list(x = c(120733.012428048, 4.88408811380055), y = c(486632.355933105,
52.3665054922233), location = c("amsterdam", "amsterdam"), espg = c(28992,
4326)), row.names = c(NA, -2L), class = "data.frame")
What we want to do is reproject our coordinates to one coordinate system of choice. In this case I used WGS 1984 (EPSG 4326).
library(sf)
#here I convert the table to a spatial object and tell R which columns contain the coordinates
dt <- st_as_sf(dt, coords = c("x", "y"))
#here I split by the different EPSG codes present
dt <- split(dt, dt$espg)
#here I loop through every individual EPSG code present in the dataset
for(i in seq_along(dt)){
#here I say which coordinate system (EPSG code) the GPS data is in
st_crs(dt[[i]]) <- unique(dt[[i]]$espg)
#here I transform the coordinates to another CRS (in this case WGS 1984, EPSG 4326)
dt[[i]] <- st_transform(dt[[i]], 4326)
}
#here I bind the items of the list together
dt <- do.call(rbind, dt)
head(dt)
Simple feature collection with 2 features and 2 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 4.884088 ymin: 52.36651 xmax: 4.884088 ymax: 52.36651
Geodetic CRS: WGS 84
location espg geometry
4326 amsterdam 4326 POINT (4.884088 52.36651)
28992 amsterdam 28992 POINT (4.884088 52.36651)
In the geometry column you now see that the coordinates are equal to one another.
Bottom line is that you need to know the geographic coordinate system the GPS data is in. Then you can convert your data from a table to a spatial object and transform the GPS data to a projection of choice.
In addition, it is always a good idea to check whether your assumption about the original EPSG code is correct, for example by plotting the data.
library(ggplot2)
library(ggspatial)
ggplot(dt) + annotation_map_tile() + geom_sf(size = 4) + theme(text = element_text(size = 15)) + facet_wrap(~espg)
In the figure below we see that the projection went well for both EPSG codes.
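Before any reprojection, the raw values from the question have to be turned into numbers and sorted by likely coordinate system. The base R sketch below is only a first triage under loud assumptions: that "/" in values such as "54/4077070001" stands in for a decimal separator, and that anything within +/-180 might be degrees. Neither assumption can be confirmed from the question; both need checking against the data source.

```r
# Raw values as listed in the question
raw <- c("547140", "35.6997", "251825.7959", "251470.43", "54/4077070001", "54/305495")

# ASSUMPTION: "/" was typed in place of a decimal separator; verify before relying on it
value <- as.numeric(gsub("/", ".", raw, fixed = TRUE))

# Heuristic triage: values within +/-180 could be degrees (lon/lat),
# larger values can only be projected units such as metres
guess <- ifelse(abs(value) <= 180, "possibly degrees", "projected units")

data.frame(raw, value, guess)
```

The triage only narrows things down; which projected CRS the large values belong to still has to come from knowledge of how the data was collected.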
Question: Polygons that cross the international dateline frequently have a North-South line through them. Eastern Russia in the rnaturalearth package is a good example of this, but I have also encountered it with other spatial data. I would like to be able to remove this line for plotting.
Attempts:
I primarily use the sf package in R for mapping. I have tried various solutions involving st_union, st_combine, st_wrap_dateline, st_remove_holes, as well as using functions from other packages such as aggregate, merge, and gUnaryUnion, but my efforts have been fruitless so far.
Example: The following code demonstrates the problem lines in Russia along the international dateline using the popular rnaturalearth package.
library(tidyverse)
library(rnaturalearth)
library(sf)
#Import data
world <- ne_countries(scale = "medium",
returnclass = "sf")
#I use the Alaska albers projection for this map,
#limit extent (https://spatialreference.org/ref/epsg/nad83-alaska-albers/)
xmin <- -2255938
xmax <- 1646517
ymin <- 449981
ymax <- 2676986
#plot
ggplot()+
geom_sf(data=world, color="black", size=1)+
coord_sf(crs=3338)+
xlim(c(xmin,xmax))+ylim(c(ymin,ymax))+
theme_bw()
Thanks!
Short answer
EPSG:3338 is the problem - use a UTM (326XX or 327XX) code instead.
Long answer
My gut feeling is this is related to the challenges of projecting geographic (long-lat) data to a flat surface - either a projected CRS, or more simply the flat surface of the plot viewer pane in RStudio.
We know that on an ellipsoidal model of Earth, the (minimum) on-ground distance between longitudes of -179 and +179 is the same as the distance between -1 and +1, a distance of 2 degrees. However, from a numerical perspective, these two lines of longitude have a distance of 358 degrees between them.
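The gap between the two views can be shown with a few lines of base R. The helper lon_diff is made up for this illustration; it wraps the difference between two longitudes into the range from -180 to 180 degrees.

```r
# Signed shortest difference between two longitudes, wrapped into [-180, 180)
lon_diff <- function(from, to) {
  ((to - from + 180) %% 360) - 180
}

abs(lon_diff(-179, 179))  # 2 degrees once the wrap-around is accounted for
abs(179 - (-179))         # 358 degrees from naive subtraction
abs(lon_diff(-1, 1))      # 2 degrees, identical to naive subtraction
```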
Imagine you are an alien (or a flat-earther) and looking at the following projection of world, and you didn't know that Earth was ellipsoidal in shape (or you didn't know this was a projection). You would be forgiven for thinking that to get from one part of Russia (red) to the other, you would have to get wet. I guess by default, ggplot is a flat-earther.
Imagine each polygon in the above plot is a piece of a jigsaw. In your plot, I guess you are setting the origin to the centre of EPSG:3338 (coord_sf(crs = 3338)), which I think is somewhere in Alaska/Canada? (I'm guessing here as I don't use this notation; I prefer to transform data before sending it to ggplot.) Regardless, ggplot knows it should rearrange its 'puzzle pieces' so that longitude -179 and +179 are next to each other, but this is purely visual, as in your plot:
So, my guess is that when you try and use st_union() or st_simplify(), the polygons aren't actually next to each other in space so are not joined. This is where a projected CRS should solve the problem, transforming the coords to values relative to an origin other than (long 0, lat 0).
This I think is one source of trouble for you - a quick google of EPSG:3338 says it is good for Alaska, but no mention of Russia. The first thing that came up when I googled 'utm russia' was EPSG:32635. So, let's take a look at the values for longitude for EPSG codes 4326 (WGS84 longlat), 3338 (NAD83 Alaska) and 32635.
# pull out russia
world %>%
filter(
str_detect(name_long, 'Russia')
) %>%
select(name_long, geometry) %>%
{. ->> russia}
# extract coords of each projection
russia %>%
st_transform(3338) %>%
{. ->> russia_3338} %>%
st_coordinates %>%
as_tibble %>%
select(X) %>%
mutate(
crs = 'epsg_3338'
) %>%
{. ->> russia_coords_3338}
russia %>%
st_transform(4326) %>%
{. ->> russia_4326} %>%
st_coordinates %>%
as_tibble %>%
select(X) %>%
mutate(
crs = 'epsg_4326'
) %>%
{. ->> russia_coords_4326}
russia %>%
st_transform(32635) %>%
{. ->> russia_32635} %>%
st_coordinates %>%
as_tibble %>%
select(X) %>%
mutate(
crs = 'utm_32635'
) %>%
{. ->> russia_coords_32635}
Let's combine them and look at a histogram of longitude values
# inspect X coords on a histogram
bind_rows(
russia_coords_3338,
russia_coords_4326,
russia_coords_32635,
) %>%
ggplot(aes(X))+
geom_histogram()+
facet_wrap(~crs, ncol = 1, scales = 'free')
So, as you can see, projections 4326 and 3338 have 2 distinct groups of coords at either end of the earth, with a big break (spanning x = 0) in between. Projection 32635, though, has only one group of coords, suggesting that the 2 parts of Russia, according to this projection, are numerically positioned next to each other. Projection 32635 works because it transforms the coords into '(minimum?) distance from an origin'. Unlike with long-lat coords, that origin is not on the opposite side of the world, so there is no need to go 2 different directions around the globe to determine the minimum distance to either end of the country (this is what causes the break in longitude coords for the other 2 projections). I don't know enough about EPSG:3338 to explain why it does this too, but I suspect it's because it is Alaska-focused, so crossing the 180th meridian was not a consideration.
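The same effect can be checked on two points instead of a whole country. This is a minimal sketch with made-up points at latitude 65 on either side of the 180th meridian; UTM zone 60N (EPSG:32660) is used here only because it sits next to the date line, which is an assumption of this example rather than a recommendation for the full map.

```r
library(sf)

# Two made-up points, 2 degrees of longitude apart, straddling the 180th meridian
pts <- st_sfc(st_point(c(179, 65)), st_point(c(-179, 65)), crs = 4326)

# X values in longitude/latitude: the points look a whole world apart
x_lonlat <- st_coordinates(pts)[, "X"]

# X values after projecting to UTM zone 60N: the points end up next to each other
x_utm <- st_coordinates(st_transform(pts, 32660))[, "X"]

diff(range(x_lonlat))  # gap in degrees
diff(range(x_utm))     # gap in metres
```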
If we plot russia_32635 we can see these pieces are next to each other, but remember we don't trust ggplot just yet. When we use st_simplify() this date line (red) disappears, proving that the 2 polygons are next to each other and can be simplified/unioned.
ggplot()+
geom_sf(data = russia_32635, colour = 'red')+
geom_sf(data = russia_32635 %>% st_simplify, fill = NA)
st_simplify() has dissolved the 2 boundaries on the date line, reducing our number of individual polygons from 100 to 98.
russia_32635 %>%
st_cast('POLYGON')
# Simple feature collection with 100 features and 1 field
# Geometry type: POLYGON
# Dimension: XY
# Bounding box: xmin: 21006.08 ymin: 4772449 xmax: 6273473 ymax: 13233690
# Projected CRS: WGS 84 / UTM zone 35N
russia_32635 %>%
st_simplify %>%
st_cast('POLYGON')
# Simple feature collection with 98 features and 1 field
# Geometry type: POLYGON
# Dimension: XY
# Bounding box: xmin: 21006.08 ymin: 4772449 xmax: 6273473 ymax: 13233690
# Projected CRS: WGS 84 / UTM zone 35N
Alternatively, it looks like st_union(..., by_feature = TRUE) also works - see ?st_union:
If by_feature is TRUE each feature geometry is unioned. This can for instance be used to resolve internal boundaries after polygons were combined using st_combine.
russia_32635 %>%
st_union(by_feature = TRUE) %>%
st_cast('POLYGON')
# Simple feature collection with 98 features and 1 field
# Geometry type: POLYGON
# Dimension: XY
# Bounding box: xmin: 21006.08 ymin: 4772449 xmax: 6273473 ymax: 13233690
# Projected CRS: WGS 84 / UTM zone 35N
So, technically there is your plot of Russia without the date line. I think Russia is tricky to plot because a) it's close to the poles, and b) it covers such a vast area meaning most projections are going to skew from one end of the country to another.
However to me, it makes sense to orient the plot 'north-up'. A way to do this is to make your own 'Mollweide' projection and assign the origin to the approximate centre of Russia (lon 99, lat 65). Without st_buffer(0), this plots with the date line for some reason (see here and here for examples, and section 6.5 here for explanation).
my_proj <- '+proj=moll +lon_0=99 +lat_0=65 +units=m'
russia_32635 %>%
st_buffer(0) %>%
st_transform(st_crs(my_proj)) %>%
st_simplify %>%
ggplot()+
geom_sf()
Bonus
I tried plotting russia_32635 %>% st_simplify with tmap and leaflet, but did not get the desired results. I assume this is because these packages prefer geographic (lon-lat) coords; leaflet only accepts longlat format as far as I can tell, and although tmap can certainly handle projected data, my guess is that under the bonnet it transforms it (or similar) to its preferred projection. Workarounds look to be available at the same links as above if you really, really want this visualisation (here, here and here).
library(tmap)
russia_32635 %>%
st_simplify %>%
tm_shape()+
tm_polygons()
library(leaflet)
russia_32635 %>%
st_simplify %>%
st_transform(4326) %>% # because leaflet only works with longlat projections
leaflet %>%
addTiles %>%
addPolygons()
Ultimately, you can only preserve two of the three primary characteristics when projecting data: area, direction or distance. This is made even more obvious when projecting something as big and polar as Russia. Hopefully one of these options is suitable for your problem.
I feel like I made significant progress, so I'm posting, but this isn't a complete answer.
# This is the portion containing the international dateline
df <- world[184, ]
# Split MULTIPOLYGON into individuals
df2 <- st_cast(df, "POLYGON")
# The little blob at the top is in df2[36, ] and df2[38, ]
# Simplify it with the right tolerance and the line is gone
ggplot()+
geom_sf(data=st_simplify(st_union(df2[36, ], df2[38, ]), dTolerance = 2), color="black", size=1)+
coord_sf(crs=3338)+
xlim(c(xmin,xmax))+ylim(c(ymin,ymax))+
theme_bw()
Result:
Another solution is to use ms_dissolve() from rmapshaper package
chukotka %>%
st_transform(32660) %>%
rmapshaper::ms_dissolve() %>%
ggplot()+
geom_sf()
Rookie R user here, and I would greatly appreciate any help someone could give me.
My project requires me to create a vector bounding box around a city of my choice and then filter a lot of data so that I only keep the data relevant to that area. However, it has been several years since I used RStudio, and it's fair to say I remember little to nothing about the language.
I have initially used
geocode("Hereford, UK")
bbox <-c(Longitude=-2.72,Latitude=52.1)
myMap <- get_map(location = "Hereford, UK",source="google",maptype="roadmap")
I then must create a new tibble which filters out and gives only the relevant data to the area.
I am unsure how to proceed with this and I then must overlay the data onto the map which I have created.
As I only have a centre point of coordinates, is it possible to create a circle with a radius of say 3 miles around the centre of my location so I can then filter this area?
Thank you all for taking the time to read my post. Cheers!
Most spatial work can now be done pretty easily using the sf package.
Example code for a similar problem is below. The comments explain most of what it does.
The difficult part may be in understanding map projections (the crs). Some use linear units (meters, feet, etc.) and others use latitude / longitude. Which one you choose depends on what area of the globe you're working with and what you're trying to accomplish. Most web mapping uses crs 4326, but that does not include an easily usable distance measurement.
The map below shows points outside ~3 miles from Hereford as red, and those inside in dark maroon. The blue point is used as the center for Hereford & the buffer zone.
library(tidyverse)
library(sf)
#> Linking to GEOS 3.6.2, GDAL 2.2.3, PROJ 4.9.3
library(mapview)
set.seed(4)
#hereford approx location, ggmap requires api key
hereford <- data.frame(place = 'hereford', lon = -2.7160, lat = 52.0564) %>%
  st_as_sf(coords = c('lon', 'lat')) %>% st_set_crs(4326)
#simulation of data points near-ish hereford
random_points <- data.frame(point_num = 1:20,
                            lon = runif(20, min = -2.8, max = -2.6),
                            lat = runif(20, min = 52, max = 52.1)) %>%
  st_as_sf(coords = c('lon', 'lat')) %>% st_set_crs(4326) %>% st_transform(27700)
#make a buffer of ~3miles (4800m) around hereford
h_buffer <- hereford %>% st_transform(27700) %>% #change crs to one measured in meters
st_buffer(4800)
#only points inside ~3mi buffer
points_within <- random_points[st_within( random_points, h_buffer, sparse = F), ]
head(points_within)
#> Simple feature collection with 6 features and 1 field
#> geometry type: POINT
#> dimension: XY
#> bbox: xmin: 346243.2 ymin: 239070.3 xmax: 355169.8 ymax: 243011.4
#> CRS: EPSG:27700
#> point_num geometry
#> 1 1 POINT (353293.1 241673.9)
#> 3 3 POINT (349265.8 239397)
#> 4 4 POINT (349039.5 239217.7)
#> 6 6 POINT (348846.1 243011.4)
#> 7 7 POINT (355169.8 239070.3)
#> 10 10 POINT (346243.2 239690.3)
#shown in mapview
mapview(hereford, color = 'blue') +
mapview(random_points, color = 'red', legend = F, col.regions = 'red') +
mapview(h_buffer, legend = F) +
mapview(points_within, color = 'black', legend = F, col.regions = 'black')
Created on 2020-04-12 by the reprex package (v0.3.0)
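The 4800 m used above is a rounded stand-in for 3 miles. If the exact radius matters, the conversion can be made explicit. The sketch below uses a hypothetical centre point given directly in British National Grid metres (EPSG:27700) and the made-up helper miles_to_m; it also compares the buffer area with that of a true circle.

```r
library(sf)

# International mile expressed in metres
miles_to_m <- function(miles) miles * 1609.344

# Hypothetical centre point, given directly in EPSG:27700 coordinates (metres)
centre <- st_sfc(st_point(c(350000, 240000)), crs = 27700)

radius_m <- miles_to_m(3)
buffer <- st_buffer(centre, dist = radius_m)

radius_m
# Ratio of the buffer area to a true circle of the same radius;
# slightly below 1 because the buffer is a polygon approximating a circle
as.numeric(st_area(buffer)) / (pi * radius_m^2)
```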