Neighboring Zipcodes from State Polygons - r

First time posting on SO
I have a shapefile that has the geometries for each Zipcode along with state name. I want to figure out which zipcodes lie on the state borders.
My plan is to merge the zipcode geometries of each state into a single state geometry, and then find the zipcodes that neighbor each state's border.
I combined the zipcodes into states using:
state_shape <- shapefile %>% group_by(State) %>% summarise(geometry = sf::st_union(geometry))
But then when I try to find the neighboring zipcodes using poly2nb
state_nb <- poly2nb(st_geometry(state_shape))
It gives me an Error:
Error in poly2nb(st_geometry(state_shape)) : Polygon geometries required
I understand that to find the border zipcodes I will eventually have to pass the zipcode geometries to poly2nb, but the error persists there as well.
Any help will be highly appreciated, also any other approaches to this problem are more than welcome.

Consider this example, built on the widely available North Carolina shapefile that is distributed with the {sf} package.
What the example does:
creates a border line of North Carolina by first dissolving the counties and then casting the resulting multipolygon to a multilinestring;
runs sf::st_touches() on the counties and the border line with sparse set to FALSE; the result is a one-column logical matrix that can be used to subset the original shapefile (keeping only the counties that touch the NC border);
presents the results graphically using {ggplot2}; the bordering counties are filled blue, and the rest are left blank for context.
library(sf)
library(dplyr)
library(ggplot2)
# all NC counties (from shapefile distributed with {sf})
shape <- st_read(system.file("shape/nc.shp", package="sf"))
# border, via dplyr::summarise() & cast as a multilinestring
border <- shape %>%
  summarise() %>%
  st_cast("MULTILINESTRING")
# logical matrix with nrow(shape) rows and a single column
neighbours <- sf::st_touches(shape,
                             border,
                             sparse = FALSE)
# report results
ggplot() +
  geom_sf(data = shape[neighbours, ], fill = "blue") + # border counties
  geom_sf(data = shape, fill = NA, color = "grey45") # all counties for context
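The same idea can be extended to a layer that holds several states, as in the original question. The following is only a sketch under stated assumptions: the NC counties stand in for zipcodes, the State column is a made-up east/west split, the data are projected to EPSG:32119 so that planar predicates apply, and st_boundary() is used instead of st_cast() to obtain one outline per state.

```r
library(sf)
library(dplyr)

# Stand-in data (assumption): NC counties play the role of zipcodes
zips <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE) %>%
  st_transform(32119)  # NAD83 / North Carolina, a projected CRS

# Hypothetical State column: split the counties into a western and an eastern half
x <- st_coordinates(st_centroid(st_geometry(zips)))[, "X"]
zips$State <- ifelse(x < median(x), "West", "East")

# One boundary line per state: dissolve by State, then take the outline
state_borders <- zips %>%
  group_by(State) %>%
  summarise() %>%
  st_boundary()

# Logical matrix, one row per polygon and one column per state:
# does polygon i touch the border of state j?
touches <- st_touches(zips, state_borders, sparse = FALSE)
zips$on_border <- rowSums(touches) > 0

table(zips$on_border)
```

Polygons flagged in on_border lie on the outline of at least one state, including borders shared between two states.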

Related

Merge adjacent polygons in sf

I'm trying to merge groups of adjacent polygons, but I'm getting big multipolygons with non-adjacent areas. In the code block below, plot(Matsuyama.sf) shows a large contiguous region and a few islands, but I can't extract those geometries. How do I get those areas and geometries?
library(sf)
library(tidyverse)
Matsuyama.sf <- st_read("https://geoshape.ex.nii.ac.jp/city/geojson/20210101/38/38201A1968.geojson")
Matsuyama.sf <- st_transform(Matsuyama.sf, crs=4326)
plot(Matsuyama.sf)
st_area(Matsuyama.sf)
I can split the multipolygon into hundreds of polygons, but the options below just lump them back together into one:
split.sf <- st_cast(Matsuyama.sf, "POLYGON")
clumps_1.sf <- st_join(split.sf, split.sf, join = st_intersects)
clumps_2sf <- Matsuyama.sf %>% mutate(INTERSECT = st_intersects(.))
What am I missing?

Why am I unable to build data frame with census block and lat/long outline?

I am new to R and working with census data and I am trying to build a csv file that can be passed to another team that shows the lat/long outline of census blocks.
With this I can get to 3 specific census blocks in Florida.
FL_blocks <- blocks("FL", year = 2010, )
FL_blocks_Alachua <- filter(FL_blocks, COUNTYFP == "001")
NTIA_FL_CB <- FL_blocks_Alachua %>%
filter(FL_blocks_Alachua$GEOID10 %in% c ("120010019071008", "120010019071007", "120010019071009"))
Now I would like to make a table that just shows the lat/long and the corresponding census block that lat/long is associated with. This will give me the list of the lat/long for each census block and I can plot them with the polygon of the census block, but for me to pass this to a team unfamiliar with R, I need the csv data output.
NTIA_FL_CB_Shape_xy <- as.data.frame(st_coordinates(NTIA_FL_CB$geometry))
NTIA_FL_CB_Shape_xy <- NTIA_FL_CB_Shape_xy %>%
rename( Longitude = X, Latitude = Y) %>%
select(Latitude,Longitude )
# save lat/long as csv
st_write(NTIA_FL_CB_Shape_xy, "fl_3cb_ntia_latlong.csv", coords = TRUE)
# plot the 3 census blocks with the outline of the shapefiles marked as green circles
leaflet(NTIA_FL_CB) %>%
addTiles() %>%
addPolygons(popup = ~GEOID10) %>%
addCircleMarkers(data = NTIA_FL_CB_Shape_xy, color = "green")
This is what I have in mind: if I could pass the lat/longs outlining each census block, the team that uses my CSV output should be able to load them into their GIS software as an overlay.
The trick is to cast to point, then export. Or, if your team is using GIS software, you could write your multipolygon to a shapefile?
library(sf)
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
nc_pt <- st_cast(nc, "POINT")
st_write(nc_pt, "nc_pt.csv", layer_options="GEOMETRY=AS_XY")
# st_write(nc, "nc.shp")
https://gis.stackexchange.com/questions/294921/convert-shapefile-into-lat-long-points-and-export-it-as-csv-in-r
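If the CSV should contain only an identifier plus the coordinates, the cast-to-point result can be trimmed before export. This is a rough sketch, assuming three NC counties stand in for the three census blocks and NAME plays the role of GEOID10; the output file name is arbitrary.

```r
library(sf)

# Stand-in data (assumption): three NC counties in place of the census blocks,
# with NAME playing the role of GEOID10
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
three <- nc[1:3, "NAME"]

# One row per vertex; the NAME attribute is repeated for every point
pts <- suppressWarnings(st_cast(three, "POINT"))
xy <- st_coordinates(pts)

outline <- data.frame(NAME = pts$NAME,
                      Longitude = xy[, "X"],
                      Latitude = xy[, "Y"])

# Plain CSV that can be imported elsewhere as delimited text
out_file <- file.path(tempdir(), "outline_latlong.csv")
write.csv(outline, out_file, row.names = FALSE)
```

Each row then carries the identifier of the polygon the vertex belongs to, which is what lets the receiving team rebuild the outlines.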

R: scaling Alaska using maptools::elide

I'm building a shapefile of states where Alaska and Hawaii are represented as being somewhere south of Texas, for ease of making an illustrative map. Using maptools package and some code from https://rud.is/b/2014/11/16/moving-the-earth-well-alaska-hawaii-with-r/, I have been able to do this with shapes from TIGER.
However, I am running into trouble now that I want to add cities to my map. Converting my shapes to sp objects and then using maptools::elide worked fine for the Alaska polygons, but elide with scale doesn't behave the same way on a collection of points, so my cities wind up in the wrong place:
library(maptools)
library(sf)
library(tmap)
library(tidyverse)
library(tidygeocoder)
ak_city_sf <-
  tribble(~city_name, ~city_search_string,
          "Juneau", "Juneau, Alaska, United States",
          "Anchorage", "Anchorage, Alaska, United States",
          "Utqiagvik", "Utqiagvik, Alaska, United States",
          "Scammon Bay", "Scammon Bay, Alaska, United States") %>%
  geocode(city_search_string, method = 'osm', lat = latitude , long = longitude) %>%
  st_as_sf(coords = c("longitude","latitude"))
st_crs(ak_city_sf) <- 4326
ak_city_sf <-
  ak_city_sf %>%
  st_transform(2163)
ak_state_sf <-
  tigris::states(cb = T) %>%
  filter(STUSPS == "AK") %>%
  st_transform(2163)
# before transformation, everything looks fine...
tm_shape(ak_state_sf) +
tm_borders() +
tm_shape(ak_city_sf) +
tm_dots(size = .1) +
tm_text("city_name",
size = .5)
SCALE_FACTOR <- 10000
ak_state_sf_scaled <-
  ak_state_sf %>%
  as("Spatial") %>%
  elide(scale = SCALE_FACTOR) %>%
  st_as_sf()
st_crs(ak_state_sf_scaled) <- 2163
ak_city_sf_scaled <-
  ak_city_sf %>%
  as("Spatial") %>%
  elide(scale = SCALE_FACTOR) %>%
  st_as_sf()
st_crs(ak_city_sf_scaled) <- 2163
# after scaling, things don't look so good
tm_shape(ak_state_sf_scaled) +
tm_borders() +
tm_shape(ak_city_sf_scaled) +
tm_dots(size = .1) +
tm_text("city_name",
size = .5)
maptools::elide seems to be the best command for doing anything like this (even though it forces me to convert to an sp object). The documentation for scale doesn't mean much to me. (I don't think I can combine them in a single object because they are points for the cities and multipolygons for the state). How can I scale the points the same way I've scaled the state?
To scale or rotate two separate geometries so that they can be mapped together, it's necessary to define a centroid that both your state geometries and city geometries will be rotated or scaled around. This makes some logical sense, since scaling or rotating a collection of points (in this case representing cities) can't happen unless you define (implicitly or explicitly) the center of the scaling or rotating.
Once you've defined a common centroid (in my case, I just used the centroid of the states I was transforming), you can use the affine transformations shown here:
https://geocompr.robinlovelace.net/geometric-operations.html#affine-transformations.
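A minimal sketch of that approach, using sf's arithmetic on geometry columns, might look like the following. The data are stand-ins (NC counties for the state, one interior point per county for the cities), the projection to EPSG:32119 is an assumption, and scale_about() is a hypothetical helper, not a function from any package.

```r
library(sf)

# Stand-in data (assumption): NC counties play the "state" polygons and
# one interior point per county plays the "cities"
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
nc <- st_transform(nc, 32119)  # projected CRS, units in metres
state  <- st_geometry(nc)
cities <- st_point_on_surface(state)

# One shared centre for both layers: the centroid of the dissolved polygons
center <- as.numeric(st_coordinates(st_centroid(st_union(state))))

# Hypothetical helper: shift to the centre, scale, shift back.
# Arithmetic on an sfc drops the CRS, so it is restored afterwards.
scale_about <- function(geom, center, factor) {
  scaled <- (geom - center) * factor + center
  st_set_crs(scaled, st_crs(geom))
}

state_scaled  <- scale_about(state,  center, 0.5)
cities_scaled <- scale_about(cities, center, 0.5)

plot(state_scaled)
plot(cities_scaled, add = TRUE, pch = 16, cex = 0.5)
```

Because both layers are scaled around the same point by the same factor, every city keeps its position relative to the polygons.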

How to determine if a point lies within an sf geometry that spans the dateline?

Using the R package sf, I'm trying to determine whether some points occur within the bounds of a shapefile (in this case, Hawai‘i's EEZ). The shapefile in question can be found here. Unfortunately, the boundaries of the area in question span +/-180 longitude, which I think is what's messing me up. (I read on the sf website about spherical geometry in the new version, but I haven't been able to get that version to install. I think the polygons I'm dealing with are sufficiently "flat" to avoid any of those issues anyway.) Part of the issue seems to be that my shapefile contains multiple geometries broken up by the dateline, but I'm not sure how to combine them.
How do you tell, using sf, whether some points are inside of the bounds of some object in a shapefile (that happens to span the dateline)?
I have tried various combinations of st_shift_longitude to no avail. I have also tried transforming to what I think is a planar projection (2163), and that didn't work.
Here's how I'm currently trying to do this:
library(sf)
library(maps)
library(ggplot2)
library(tidyverse)
# this is the shapefile from the link above
eez_unshifted <- read_sf("USMaritimeLimitsAndBoundariesSHP/USMaritimeLimitsNBoundaries.shp") %>%
  filter(OBJECTID == 1206) %>%
  st_transform(4326)
eez_shifted <- read_sf("USMaritimeLimitsAndBoundariesSHP/USMaritimeLimitsNBoundaries.shp") %>%
  filter(OBJECTID == 1206) %>%
  st_transform(4326) %>%
  st_shift_longitude()
# four points, in and out of the geometry, on either side of the dateline
pnts <- tibble(x = c(-171.952474, 176.251978, 179.006220, -167.922929),
               y = c(25.561970, 17.442716, 28.463375, 15.991429)) %>%
  st_as_sf(coords = c('x','y'), crs = st_crs(eez_unshifted))
# these all return false for every point
st_within(pnts,eez_unshifted)
st_within(st_shift_longitude(pnts),eez_unshifted)
st_within(pnts,eez_shifted)
st_within(st_shift_longitude(pnts),eez_shifted)
# these also all return false for every point
st_intersects(pnts,eez_unshifted)
st_intersects(st_shift_longitude(pnts),eez_unshifted)
st_intersects(pnts,eez_shifted)
st_intersects(st_shift_longitude(pnts),eez_shifted)
# plot the data just to show that it looks right
wrld2 <- st_as_sf(maps::map('world2', plot=F, fill=T))
ggplot() +
geom_sf(data=wrld2, fill='gray20',color="lightgrey",size=0.07) +
geom_sf(data=eez_shifted) +
geom_sf(data=st_shift_longitude(pnts)) +
coord_sf(xlim=c(100,290), ylim=c(-60,60)) +
xlab("Longitude") +
ylab("Latitude")
The answer is to make sure the geometry you're checking against is a polygon:
> eez_poly <- st_polygonize(eez_shifted)
> st_within(pnts,eez_poly)
although coordinates are longitude/latitude, st_within assumes that they are planar
Sparse geometry binary predicate list of length 4, where the predicate was `within'
1: 1
2: (empty)
3: 1
4: (empty)
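Why this matters can be seen on a toy example: a closed line encloses no area, so no point can be within it until it is turned into a polygon. The sketch below uses made-up planar coordinates without a CRS, not the EEZ data.

```r
library(sf)

# A closed ring stored as a LINESTRING (made-up planar coordinates)
ring <- st_sfc(st_linestring(rbind(c(0, 0), c(10, 0), c(10, 10), c(0, 10), c(0, 0))))

# One point inside the ring, one outside
pts <- st_sfc(st_point(c(5, 5)), st_point(c(20, 5)))

# Against the line, neither point is "within": a line has no interior area
within_line <- lengths(st_within(pts, ring))

# Polygonize the ring, then pull the polygon out of the geometry collection
poly <- st_collection_extract(st_polygonize(ring), "POLYGON")
within_poly <- lengths(st_within(pts, poly))

within_line
within_poly
```

Only after polygonizing does the inside point register as within the shape.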

map correct id to unique identifier in shp file coordinates

I have a shapefile, http://census.cso.ie/censusasp/saps/boundaries/Census2011_Small_Areas_generalised20m.zip
and want to extract the long/lat, but I am not sure how to map the correct coordinate to the correct small area.
My code is:
require(ggplot2)
require(proj4)
require(rgdal)
a=readOGR(....shp)
dublin = a[a$NUTS3NAME=='Dublin',]
dublin=spTransform(dublin,CRS('+proj=longlat +ellps=WGS84 +datum=WGS84'))
b=data.frame(dublin)
sa=fortify(dublin,SA='SMALL_AREA')
pj=project(sa[,1:2],proj4string(dublin),inverse=TRUE)
latlon=data.frame(latdeg=pj$y,londeg=pj$x)
sa=data.frame(cbind(latlon,sa))
The number of unique values of sa$id (4500) is the same as the number of unique values of b$SMALL_AREA (4500). How is, for example, an id of 22 in sa mapped to the correct small area in b?
There are 56k rows in sa and 4500 rows in b.
Any suggestions are appreciated.
I am working in R.
Shapefiles are much easier to work with and understand using the sf package in R. It keeps things tidy and rectangular, with the added $geometry list-column.
For your example, getting the lat & lon for the Dublin area:
library(sf)
library(tidyverse)
a <- read_sf('Census2011_Small_Areas_generalised20m/Census2011_Small_Areas_generalised20m.shp')
# dplyr filter() works for sf objects
dublin <- a %>% filter(NUTS3NAME == 'Dublin')
# Transform to WGS84 coordinates
dublin <- dublin %>% st_transform(st_crs(4326))
# Proof CRS has changed
st_crs(dublin)
# lat/lon coords
st_coordinates(dublin) %>% head()
In this case, the sf geometry is of MULTIPOLYGON type. Each observation has between 4 and 168 connected lat/lon points associated with it. If you are interested in a single point for each observation, the centroid might be a good approximation.
Using dublin %>% st_centroid() will return all the data, but with the $geometry column consisting of a single point. Getting just the centroid points (as a matrix) can be achieved using dublin %>% st_centroid %>% st_coordinates().
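To keep each coordinate tied to its identifier, the feature index returned by st_coordinates() can be used as a lookup. This is a sketch under assumptions: the NC counties stand in for the Dublin small areas, and NAME plays the role of SMALL_AREA.

```r
library(sf)

# Stand-in data (assumption): NC counties, with NAME in place of SMALL_AREA
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
nc <- st_transform(nc, 4326)

# Every vertex, tagged with the ID of the feature it belongs to.
# For MULTIPOLYGON input, column L3 of st_coordinates() is the row index
# of the feature in nc.
xy <- st_coordinates(nc)
vertices <- data.frame(NAME = nc$NAME[xy[, "L3"]],
                       lon = xy[, "X"],
                       lat = xy[, "Y"])

# One centroid per feature, returned in the same row order as nc
cent <- st_coordinates(suppressWarnings(st_centroid(st_geometry(nc))))
centroids <- data.frame(NAME = nc$NAME, lon = cent[, "X"], lat = cent[, "Y"])

head(vertices)
head(centroids)
```

The vertices table has many rows per identifier, while the centroids table has exactly one.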
Finally, a plot of the Dublin subset of the shapefile & the respective centroid points. There are quite a few shapes in a small area, making things hard to see. In the outskirts with larger polygons the centroids should be more visible.
dublin %>%
  st_centroid() %>%
  ggplot() +
  geom_sf(size = .4, color = '#FF7900') +
  geom_sf(data = dublin,
          color = '#009A49',
          fill = NA,
          size = .2) +
  theme(panel.background = element_rect(fill = "black")) +
  coord_sf(datum = NA)
