How to create zipcode boundaries in R

I am trying to create a map that shows the boundaries of several groups of zip codes, with each group labelled by the name of its 'community'. My data looks like the sample below, where each variable is the name of a community and its values are the corresponding zip codes.
Tooele <- c('84074','84029')
NEUtahCo <- c('84003', '84004', '84042', '84062')
NWUtahCounty <- c('84005','84013','84043','84045')
I was able to make a map of the entire area I want using
ggmap(get_map(location = c(lon=-111.9, lat= 40.7), zoom = 9))
Attached is a picture of what I want.

You have a decent foundation for this already by having figured out the shapefile and how it matches the zips you want to show. Simple features (sf) make this pretty easy, as does the brand new ggplot2 v3.0.0 which has the geom_sf to plot sf objects.
I wasn't sure if the names of the different areas (counties?) that you have are important, so I just threw them all into little tibbles and bound that into one tibble, utah_zips. tigris also added sf support, so if you set class = "sf", you get an sf object. To keep it simple, I'm just pulling out the columns you need and simplifying one of the names.
library(tidyverse)
library(tigris)
library(ggmap)
Tooele <- c('84074','84029')
NEUtahCo <- c('84003', '84004', '84042', '84062')
NWUtahCounty <- c('84005','84013','84043','84045')
utah_zips <- bind_rows(
tibble(area = "Tooele", zip = Tooele),
tibble(area = "NEUtahCo", zip = NEUtahCo),
tibble(area = "NWUtahCounty", zip = NWUtahCounty)
)
zips_sf <- zctas(cb = T, starts_with = "84", class = "sf") %>%
select(zip = ZCTA5CE10, geometry)
head(zips_sf)
#> Simple feature collection with 6 features and 1 field
#> geometry type: MULTIPOLYGON
#> dimension: XY
#> bbox: xmin: -114.0504 ymin: 37.60461 xmax: -109.0485 ymax: 41.79228
#> epsg (SRID): 4269
#> proj4string: +proj=longlat +datum=NAD83 +no_defs
#> zip geometry
#> 37 84023 MULTIPOLYGON (((-109.5799 4...
#> 270 84631 MULTIPOLYGON (((-112.5315 3...
#> 271 84334 MULTIPOLYGON (((-112.1608 4...
#> 272 84714 MULTIPOLYGON (((-113.93 37....
#> 705 84728 MULTIPOLYGON (((-114.0495 3...
#> 706 84083 MULTIPOLYGON (((-114.0437 4...
Then you can filter the sf for just the zips you need—since there's other information (the county names), you can use a join to get everything in one sf data frame:
utah_sf <- zips_sf %>%
inner_join(utah_zips, by = "zip")
head(utah_sf)
#> Simple feature collection with 6 features and 2 fields
#> geometry type: MULTIPOLYGON
#> dimension: XY
#> bbox: xmin: -113.1234 ymin: 40.21758 xmax: -111.5677 ymax: 40.87196
#> epsg (SRID): 4269
#> proj4string: +proj=longlat +datum=NAD83 +no_defs
#> zip area geometry
#> 1 84029 Tooele MULTIPOLYGON (((-112.6292 4...
#> 2 84003 NEUtahCo MULTIPOLYGON (((-111.8497 4...
#> 3 84074 Tooele MULTIPOLYGON (((-112.4191 4...
#> 4 84004 NEUtahCo MULTIPOLYGON (((-111.8223 4...
#> 5 84062 NEUtahCo MULTIPOLYGON (((-111.7734 4...
#> 6 84013 NWUtahCounty MULTIPOLYGON (((-112.1564 4...
You already have your basemap figured out, and since ggmap makes ggplot objects, you can just add on a geom_sf layer. The tricks are just to make sure you declare the data you're using, set it to not inherit the aes from ggmap, and turn off the graticules in coord_sf.
basemap <- get_map(location = c(lon=-111.9, lat= 40.7), zoom = 9)
ggmap(basemap) +
geom_sf(aes(fill = zip), data = utah_sf, inherit.aes = F, size = 0, alpha = 0.6) +
coord_sf(ndiscr = F) +
theme(legend.position = "none")
You might want to adjust the position of the basemap, since it cuts off one of the zips. One way is to use st_bbox to get the bounding box of utah_sf, then use that to get the basemap.
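A minimal sketch of that st_bbox idea. To run without a download, it uses a made-up rectangle (demo_sf, hypothetical) in place of utah_sf, and pad is an arbitrary margin. The get_map call is left as a comment because it needs network access.

```r
library(sf)

# Stand-in for utah_sf: one rectangle roughly matching the extent printed above
# (hypothetical geometry, only here so the sketch runs offline)
demo_sf <- st_sf(
  zip = "00000",
  geometry = st_sfc(st_polygon(list(rbind(
    c(-113.1, 40.2), c(-111.5, 40.2), c(-111.5, 40.9),
    c(-113.1, 40.9), c(-113.1, 40.2)
  ))), crs = 4269)
)

bb <- st_bbox(demo_sf)  # named vector: xmin, ymin, xmax, ymax
pad <- 0.1              # margin in degrees, an arbitrary choice

# ggmap documents a bounding box as left, bottom, right, top
map_bbox <- c(left   = bb[["xmin"]] - pad,
              bottom = bb[["ymin"]] - pad,
              right  = bb[["xmax"]] + pad,
              top    = bb[["ymax"]] + pad)

# basemap <- ggmap::get_map(location = map_bbox)  # needs network access
```

With the real data, swapping demo_sf for utah_sf should give a basemap that covers every selected zip.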

Related

How to build Voronoi Polygons for point coordinates in R?

I have 2000+ points for observation stations in the Alps. I would like each station to represent the geographic area that is closer to it than to any other station. After some research, I think Voronoi polygons may be the best way to do this.
After attempting to build these in R, the polygon plot does not match my map of the points.
I have attached the sample data points I am experimenting with, as well as the two images that show the dissimilar graphing of the points.
What do I need to do differently to make sure that the points line up?
Points:
Longitude:
15.976667 12.846389 14.457222 13.795556 9.849167 16.055278 13.950833 15.666111 9.654722 15.596389 13.226667 15.106667 13.760000 12.226111 9.612222 17.025278 9.877500 15.368056 13.423056 12.571111 16.842222 13.711667 14.003056 12.308056 13.536389
Latitude:
48.40167 48.14889 47.56778 46.72750 47.45833 48.04472 47.82389 47.49472 47.35917 48.64917 48.25000 48.87139 47.87444 47.42806 47.20833 47.77556 47.40389 47.87583 47.53750 46.77694 47.74250 46.55000 48.37611 47.38333 47.91833
Pictures:
Map of the 25 sample points in Leaflet:
Voronoi plot:
Clearly these two are not the same images, so I must be doing something wrong. Here's the code I'm using to generate the Voronoi plot and the leaflet map.
meta25%>%
st_as_sf(coords = c("Longitude", "Latitude"),
crs = sp::CRS("+proj=longlat +datum=WGS84")) %>%
mapview()
m1 = matrix(meta25$Longitude,meta25$Latitude,ncol=2,nrow=25) %>% st_multipoint()
voronoi_grid <- st_voronoi(m1)
plot(voronoi_grid, col = NA)
plot(m1, add = TRUE, col = "blue", pch = 16)
I'm not sure what the problem is, but the matrix is not necessary. Stick to sf objects and you should be fine.
library(tidyverse)
library(sf)
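# create pts from the question's data: x holds the longitudes, y the latitudes
# (sf expects coords in x, y order, i.e. longitude first)
pts <- tibble(longitude = x, latitude = y) %>%
st_as_sf(coords = c('longitude', 'latitude')) %>%
st_set_crs(4326)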
# voronoi of pts
vor <- st_voronoi(st_combine(pts))
head(vor)
#> Geometry set for 1 feature
#> Geometry type: GEOMETRYCOLLECTION
#> Dimension: XY
#> Bounding box: xmin: 2.199166 ymin: 39.13694 xmax: 24.43833 ymax: 56.28445
#> Geodetic CRS: WGS 84
#> GEOMETRYCOLLECTION (POLYGON ((2.199166 49.37841...
# st_voronoi returns a GEOMETRYCOLLECTION,
# some plotting methods can't use a GEOMETRYCOLLECTION.
# this returns polygons instead
vor_poly <- st_collection_extract(vor)
head(vor_poly)
#> Geometry set for 6 features
#> Geometry type: POLYGON
#> Dimension: XY
#> Bounding box: xmin: 2.199166 ymin: 39.13694 xmax: 18.32787 ymax: 56.28445
#> Geodetic CRS: WGS 84
#> First 5 geometries:
#> POLYGON ((2.199166 49.37841, 2.199166 56.28445,...
#> POLYGON ((9.946349 39.13694, 2.199166 39.13694,...
#> POLYGON ((18.32787 39.13694, 11.64381 39.13694,...
#> POLYGON ((9.794868 47.23828, 9.766296 47.38061,...
#> POLYGON ((5.225657 56.28445, 9.393793 56.28445,...
plot(st_geometry(pts), col = 'blue', pch = 16)
# col = NA leaves the polygons unfilled so the points stay visible
plot(vor_poly, add = TRUE, col = NA)
Created on 2021-04-05 by the reprex package (v0.3.0)
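A possible explanation for the mismatch in the question: matrix() takes a single data vector, so in matrix(meta25$Longitude, meta25$Latitude, ncol=2, nrow=25) the latitudes never end up as the second coordinate column. If a matrix is wanted anyway, cbind() builds the two-column layout that st_multipoint expects. A small sketch using the first five points from the question (lon, lat, m and mp are illustrative names):

```r
library(sf)

# First five stations from the question
lon <- c(15.976667, 12.846389, 14.457222, 13.795556, 9.849167)
lat <- c(48.40167, 48.14889, 47.56778, 46.72750, 47.45833)

# One row per point: column 1 = longitude (x), column 2 = latitude (y)
m <- unname(cbind(lon, lat))

mp  <- st_multipoint(m)
vor <- st_voronoi(mp)   # returns a GEOMETRYCOLLECTION of polygons

plot(vor, col = NA)
plot(mp, add = TRUE, col = "blue", pch = 16)
```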
Thanks everyone for your help, not sure if it got quite to where I was looking for. I've since adapted the answer from here: Creating bordering polygons from spatial point data for plotting in leaflet

How can I bin data into hexagons of a shapefile and plot it?

I am new to R and to this website, and I have run into trouble with my current distribution project. My goal is to create a map of hexagons coloured by a gradient based on different attributes of each hexagon, for example the number of records, the number of species, or rarefaction. I started with two shapefiles.
One for the hexagons:
Simple feature collection with 10242 features and 4 fields
geometry type: MULTIPOLYGON
dimension: XY
bbox: xmin: -180 ymin: -90 xmax: 180 ymax: 90
CRS: 4326
First 10 features:
ID CENTRELAT CENTRELON AREA geometry
1 -43.06618 41.95708 41583.14 MULTIPOLYGON (((43.50039 -4...
2 -73.41802 -144.73583 41836.20 MULTIPOLYGON (((-147.695 -7...
4862 -82.71189 -73.45815 50247.96 MULTIPOLYGON (((-78.89901 -...
7162 88.01938 53.07438 50258.17 MULTIPOLYGON (((36.63494 87...
3 -75.32015 -145.44626 50215.61 MULTIPOLYGON (((-148.815 -7...
4 -77.21239 -146.36437 50225.85 MULTIPOLYGON (((-150.2982 -...
5 -79.11698 -147.60550 50234.84 MULTIPOLYGON (((-152.3518 -...
6 -81.03039 -149.37750 50242.49 MULTIPOLYGON (((-155.3729 -...
7 -82.94618 -152.11105 50248.70 MULTIPOLYGON (((-160.2168 -...
8 -84.84996 -156.85274 50253.03 MULTIPOLYGON (((-169.0374 -...
And one for the map: geometry type: POLYGON; dimension: XY; bbox: xmin: -180 ymin: -90 xmax: 180 ymax: 83.64513; CRS: 4326
It is the land shapefile from this link:
natural earth data
I loaded them with the st_read function. And created a map with this code:
ggplot() +
geom_sf(data = hex5) +
geom_sf(data = land) +
coord_sf(1, xlim = c(100, 180), ylim = c(0, 90))
The map
I have a data frame that contains species names, longitude and latitude. Roughly 6300 entries.
scientific lat lon
1 Acoetes melanonota 11.75690 124.8010
2 Acoetes melanonota 11.97500 102.7350
3 Acoetes melanonota 13.33000 100.9200
4 Acrocirrus muroranensis 42.31400 140.9670
5 Acrocirrus uchidai 43.04800 144.8560
6 Acrocirrus validus 35.30000 139.4830
7 Acutomunna minuta 29.84047 130.9178
8 Admetella longipedata 13.35830 120.5090
9 Admetella longipedata 13.60310 120.7570
10 Aega acuticauda 11.95750 124.1780
How can I bin this data into the hexagons of the map and colour them with a gradient?
Thank you very much!
As I understand it, you have some points and some polygons. You want to summarise the values of the points by the polygon they are in. I made a reproducible example of a possible solution:
library(sf)
library(data.table)
library(dplyr)
# Create a hexagonal grid
sfc = sf::st_sfc(sf::st_polygon(list(rbind(c(0,0), c(1,0), c(1,1), c(0,0)))))
G = sf::st_make_grid(sfc, cellsize = .1, square = FALSE)
# Convert to sf object
# seq_along(G) avoids hard-coding the number of hexagons in the grid
G = sf::st_as_sf(data.table(id_hex=seq_along(G), geom=sf::st_as_text(G)), wkt='geom')
# Create random points on the grid with random value
n=500
p = data.table(id_point=1:n,
value = rnorm(n),
x=sample(seq(0,1,0.01), n, replace=T),
y=sample(seq(0,1,0.01), n, replace=T)
)
p = p[x >= y]
P = sf::st_as_sf(p, coords=c('x', 'y'))
# Plot geometry
plot(sf::st_geometry(G))
plot(P, add=TRUE)
# Join the geometries to associate each polygon to the points it contains
# Group by and summarise
J = sf::st_join(G, P, join=sf::st_contains) %>%
dplyr::group_by(id_hex) %>%
dplyr::summarise(sum_value=sum(value, na.rm=F),
# count matched points only: an empty hexagon still has one all-NA row after the left join
count_value=sum(!is.na(id_point)),
mean_value=mean(value, na.rm=F))
plot(J)
# Plot interactive map with mapview package
mapview::mapview(J, zcol="count_value") +
mapview::mapview(P)
Created on 2020-04-25 by the reprex package (v0.3.0)
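The same join-and-summarise idea, sketched for the species table in the question. This is an illustration under assumptions: hex is a coarse stand-in grid built with st_make_grid instead of the real hex5 shapefile, the cell size is arbitrary, and only the first six records are used. The join runs from points to hexagons, so each record picks up the ID of the hexagon it falls in.

```r
library(sf)
library(dplyr)
library(ggplot2)

# First rows of the species table from the question
records <- data.frame(
  scientific = c("Acoetes melanonota", "Acoetes melanonota", "Acoetes melanonota",
                 "Acrocirrus muroranensis", "Acrocirrus uchidai", "Acrocirrus validus"),
  lat = c(11.7569, 11.975, 13.33, 42.314, 43.048, 35.3),
  lon = c(124.801, 102.735, 100.92, 140.967, 144.856, 139.483)
)
pts <- st_as_sf(records, coords = c("lon", "lat"), crs = 4326)

# Stand-in for hex5: hexagonal grid over part of the study area (hypothetical)
area <- st_as_sfc(st_bbox(c(xmin = 95, ymin = 5, xmax = 150, ymax = 50),
                          crs = st_crs(4326)))
grid <- st_make_grid(area, cellsize = 10, square = FALSE)
hex  <- st_sf(ID = seq_along(grid), geometry = grid)

# Attach the hexagon ID to every record, then summarise per hexagon
per_hex <- st_join(pts, hex, join = st_intersects) %>%
  st_drop_geometry() %>%
  group_by(ID) %>%
  summarise(n_records = n(), n_species = n_distinct(scientific))

# Put the counts back on the hexagons; hexagons without records get NA
hex_counts <- left_join(hex, per_hex, by = "ID")

# Colour gradient by number of records
p <- ggplot(hex_counts) +
  geom_sf(aes(fill = n_records)) +
  scale_fill_viridis_c(na.value = "grey90")
```

With the real data, hex would be replaced by hex5 and fill = n_species would colour by species richness instead.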

Create new geometry on grouped column in R sf

I'd like to create a new shapefile or a new geometry variable that allows me to plot borders around regions in R. I'm using sf and mapping with tmap. Basically, I'm adding a character vector to an sf object and would like to make the character vector the new/preferred mapping border.
Here is an example of my approach, which doesn't do what I'd like. I can't tell that it does anything.
library(tidyverse)
library(sf)
library(tmap)
## use North Carolina example
nc = st_read(system.file("shape/nc.shp", package="sf"))
nc_new.region <- nc %>% ## add new region variable
mutate(new.region = sample(c('A', 'B', 'C'), nrow(.),replace = T))
nc_union <- nc_new.region %>%
group_by(new.region) %>% # group by the new character vector
mutate(new_geometry = st_union(geometry)) # union on the geometry variable
# map with tmap package
tm_shape(nc_union)+
tm_borders()
This happens because mutate(new_geometry = st_union(geometry)) creates a "new" column within the original sf object, but plotting still uses the "original" geometry column. Indeed, if you have a look at your nc_union object, you'll see that it still contains 100 features (therefore, no "dissolving" was really done).
To do what you wish, you should instead create a "new" sf object using summarize over the groups:
library(tidyverse)
library(sf)
library(tmap)
## use North Carolina example
nc = st_read(system.file("shape/nc.shp", package="sf"))
#> Reading layer `nc' from data source `D:\Documents\R\win-library\3.5\sf\shape\nc.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 100 features and 14 fields
#> geometry type: MULTIPOLYGON
#> dimension: XY
#> bbox: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
#> epsg (SRID): 4267
#> proj4string: +proj=longlat +datum=NAD27 +no_defs
nc_new.region <- nc %>% ## add new region variable
mutate(new.region = sample(c('A', 'B', 'C'), nrow(.),replace = T))
nc_union <- nc_new.region %>%
group_by(new.region) %>%
summarize()
> nc_union
Simple feature collection with 3 features and 1 field
geometry type: MULTIPOLYGON
dimension: XY
bbox: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
epsg (SRID): 4267
proj4string: +proj=longlat +datum=NAD27 +no_defs
# A tibble: 3 x 2
new.region geometry
<chr> <MULTIPOLYGON [°]>
1 A (((-78.65572 33.94867, -79.0745 34.30457, -79.04095 34.3193, -79.02947 34.34737, -7~
2 B (((-79.45597 34.63409, -79.6675 34.80066, -79.68596 34.80526, -79.66015 34.8179, -7~
3 C (((-78.65572 33.94867, -78.63472 33.97798, -78.63027 34.0102, -78.58778 34.03061, -~
tm_shape(nc_union)+
tm_borders()
You can see that nc_union now contains only 3 MULTIPOLYGONs, and the plot reflects the "aggregation".
See also: https://github.com/r-spatial/sf/issues/290
Created on 2019-08-23 by the reprex package (v0.3.0)
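As a small extension of the same pattern, summarise can also carry per-region statistics while it dissolves the geometries. A sketch, assuming the nc data shipped with sf; the seed and the column names n_counties and births_1974 are arbitrary choices for illustration:

```r
library(sf)
library(dplyr)

nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

set.seed(42)  # arbitrary seed so the random regions are repeatable
nc$new.region <- sample(c("A", "B", "C"), nrow(nc), replace = TRUE)

# summarise() on a grouped sf object unions the geometries of each group
# and computes the requested statistics at the same time
nc_regions <- nc %>%
  group_by(new.region) %>%
  summarise(n_counties = n(), births_1974 = sum(BIR74))

nrow(nc)          # 100 county features
nrow(nc_regions)  # one feature per region
```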

R googleway with ABS census data

I'm intending to analyse Australian census data using googleway to produce heat maps.
My approach has been to use prepared data from googleway melbourne which contains column SA2_NAME and join it with the ESRI shape file from the census data after conversion with rgdal (code below). The problem is that joining by SA2_NAME is not unique for polylines - some SA2 areas are made of multiple 'sub' areas. So it seems this is not a good approach.
A better approach would be to convert the ESRI shape data sa2_shape below to have polylines in the format of the melbourne data. How is this done?
Code below produces a 'bridging' data frame to use in joining melbourne data from googleway with ABS data which has SA2_MAIN as the key field - as stated above, the problem with this 'hack' approach is that polylines are not unique by SA2_NAME
library(tidyverse)
library(googleway)
library(rgdal)
shape_path <- "abs_data/sa2_esri_shapefile"
shape_file <- "SA2_2016_AUST"
sa2_shape <- readOGR(shape_path, shape_file)
sa2_df <- data.frame(sa2_shape$SA2_MAIN, sa2_shape$SA2_NAME)
names(sa2_df) <- c("SA2_MAIN", "SA2_NAME")
sa2_df <- sa2_df %>% semi_join(melbourne, by = "SA2_NAME")
Following SymbolixAU's comment, I used sf to load the data. This works as long as no geometry is an empty list; see the code below.
library(tidyverse)
library(googleway)
library(sf)
shape_path <- "abs_data/sa2_esri_shapefile"
shape_file <- "SA2_2016_AUST"
shape_file_path <- paste0(shape_path, "/", shape_file, '.shp')
sa2_shape <- sf::st_read(shape_file_path)
sa2_shape <- sa2_shape %>%
filter(STATE_NAME == "Victoria",
AREA_SQKM > 0) # This is important - otherwise google_map() will crash!
google_map() %>%
googleway::add_polygons(data = sa2_shape,
polyline = "geometry",
fill_colour = "SA2_NAME")
> sa2_shape %>% head()
Simple feature collection with 6 features and 6 fields
geometry type: MULTIPOLYGON
dimension: XY
bbox: xmin: 143.6849 ymin: -37.68153 xmax: 143.951 ymax: -37.46847
epsg (SRID): 4283
proj4string: +proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +no_defs
SA2_MAIN SA2_MAIN16 SA2_NAME STATE_CODE STATE_NAME AREA_SQKM geometry
1 201011001 201011001 Alfredton 2 Victoria 52.7111 MULTIPOLYGON (((143.7072 -3...
2 201011002 201011002 Ballarat 2 Victoria 12.3787 MULTIPOLYGON (((143.8675 -3...
3 201011003 201011003 Ballarat - North 2 Victoria 92.3577 MULTIPOLYGON (((143.853 -37...
4 201011004 201011004 Ballarat - South 2 Victoria 32.8541 MULTIPOLYGON (((143.8675 -3...
5 201011005 201011005 Buninyong 2 Victoria 51.5855 MULTIPOLYGON (((143.8533 -3...
6 201011006 201011006 Delacombe 2 Victoria 34.1608 MULTIPOLYGON (((143.7072 -3...
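The AREA_SQKM > 0 filter above appears to act as a proxy for "has a geometry". A more direct check, sketched here on a made-up two-row object (shapes and its values are hypothetical, not ABS data), is to drop empty geometries with st_is_empty:

```r
library(sf)

# Toy data: one real polygon and one empty geometry
shapes <- st_sf(
  SA2_NAME = c("With geometry", "No geometry"),
  geometry = st_sfc(
    st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 0)))),
    st_polygon(),   # empty polygon
    crs = 4283
  )
)

# Keep only the rows whose geometry is not empty
shapes_ok <- shapes[!st_is_empty(shapes), ]
```

The filtered object could then be passed to add_polygons in the same way as sa2_shape above.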

Subset class sfc_LINESTRING & sfc objects within a bbox

Example:
bbox <- c(-0.1178, 51.4232, -0.0185, 51.5147) # I know it needs to be sf df object
# we have
df
#> Geometry set for 300 features
#> geometry type: LINESTRING
#> dimension: XY
#> bbox: xmin: -0.113894 ymin: 51.49739 xmax: -0.0764779 ymax: 51.59839
#> epsg (SRID): 4326
#> proj4string: +proj=longlat +datum=WGS84 +no_defs
#> LINESTRING (-0.113894 51.50631, -0.1135137 51.5...
#> LINESTRING (-0.0767875 51.59837, -0.0764779 51....
#> ....
How can I do something like
df[bbox]
and keep the linestrings which are within the bbox. Thanks.
Here's an example using an sf object from tigris, just for reproducibility. I'm using towns in New Haven County, Connecticut, plotting it the way it comes in. Then I crop it to a bounding box I made up, using st_crop, which I believe was added fairly recently to sf. If I had the bbox as a shape, instead of a vector of coordinates, I could have used st_intersection.
I don't have a linestring object handy, but I'd assume it works the same way.
library(tidyverse)
library(sf)
# selecting just to limit the amount of data in my sf
ct_sf <- tigris::county_subdivisions(state = "09", county = "09", cb = T, class = "sf") %>%
select(NAME, geometry)
plot(ct_sf)
crop_bbox <- c(xmin = -73, ymin = 41.2, xmax = -72.7, ymax = 41.5)
ct_cropped <- st_crop(ct_sf, crop_bbox)
plot(ct_cropped)
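For linestrings specifically, a minimal sketch of three options. The three lines use made-up coordinates loosely based on the question's printout, and the names lines, box, inside, touching and clipped are illustrative. st_within keeps lines lying entirely inside the box, st_intersects keeps any line that reaches into it, and st_crop clips the geometries at the box edge.

```r
library(sf)

# Hypothetical linestrings (coordinates invented for illustration)
lines <- st_sfc(
  st_linestring(rbind(c(-0.113894, 51.50631), c(-0.1135137, 51.5070))),    # inside the box
  st_linestring(rbind(c(-0.0767875, 51.59837), c(-0.0764779, 51.59839))),  # north of the box
  st_linestring(rbind(c(-0.05, 51.50), c(0.05, 51.50))),                   # crosses the east edge
  crs = 4326
)

# The bbox from the question, as a bbox object and as a polygon
bbox <- st_bbox(c(xmin = -0.1178, ymin = 51.4232, xmax = -0.0185, ymax = 51.5147),
                crs = st_crs(4326))
box <- st_as_sfc(bbox)

# Whole linestrings that lie entirely within the box
inside <- lines[st_within(lines, box, sparse = FALSE)[, 1]]

# Whole linestrings that intersect the box at all
touching <- lines[st_intersects(lines, box, sparse = FALSE)[, 1]]

# Geometries clipped to the box
clipped <- st_crop(lines, bbox)
```

Which one fits depends on whether partially overlapping lines should be dropped, kept whole, or cut at the boundary.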
