How to remove "almost holes" from polygons using R sf? - r

I have a polygon with some holes and some "almost holes" that are long, skinny inlets into the polygon. (polygon files HERE)
field <- sf::read_sf("example_field.kml")
ggplot() +
geom_sf(data = field) +
theme_void()
I can remove the one "true" hole from the polygon using:
field_no_holes <- nngeo::st_remove_holes(field)
ggplot() +
geom_sf(data = field_no_holes) +
theme_void()
However, that still leaves several long, skinny "almost holes". Any ideas about how to efficiently remove these? For reference, here is what I'm going for as an end goal (I created this by manually deleting vertices).
field_fixed <- sf::read_sf("example_field_fixed.kml") %>%
nngeo::st_remove_holes()
ggplot() +
geom_sf(data = field_fixed) +
theme_void()

A couple of ways to do this are below. You'll have to experiment with the distances to settle on what counts as "skinny".
The first is based on the @dieghernan comment, with an added step of un-buffering to return the shape to approximately its original size:
library(tidyverse)
library(sf)
x <- read_sf('example_field.kml')
x %>%
st_transform(3857) %>% # crs 3857 allows buffering in meters
st_buffer(dist = 20) %>% # make the polygon 20m bigger to get rid of 'skinny' holes
st_buffer(dist = -20) %>% # shrink it back down to approximately the right size
ggplot() + # plotting
geom_sf(fill = 'black') +
geom_sf(data = x, fill = NA, col = 'red') +
theme_void()
Original polygon in red outline, new polygon filled in black:
This could run into some problems with concave points that aren't quite the right kind of 'skinny'. See the dent in the bottom right.
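To see why buffering and then un-buffering closes narrow inlets, here is a minimal sketch on a made-up shape rather than the field from the question. The square, the 4-unit-wide inlet, the buffer distance of 10, and the names toy_field and probe are all assumptions chosen for illustration; coordinates are treated as planar and no CRS is set. Roughly speaking, a parallel-sided inlet disappears once the buffer distance exceeds half its width.

```r
library(sf)

# Hypothetical 100 x 100 square with a 4-unit-wide inlet cut into its top edge
outline <- rbind(
  c(0, 0), c(100, 0), c(100, 100),
  c(52, 100), c(52, 40), c(48, 40), c(48, 100),
  c(0, 100), c(0, 0)
)
toy_field <- st_sfc(st_polygon(list(outline)))

# Grow by 10 units (more than half the inlet width), then shrink by the same amount
grown  <- st_buffer(toy_field, dist = 10)
closed <- st_buffer(grown, dist = -10)

# A probe point in the middle of the inlet: outside the original, inside the result
probe <- st_sfc(st_point(c(50, 70)))
in_original <- st_intersects(probe, toy_field, sparse = FALSE)[1, 1]
in_closed   <- st_intersects(probe, closed, sparse = FALSE)[1, 1]
```

The same reasoning suggests a starting point for the real data: measure the widest inlet you want removed and try a buffer distance a little over half of that.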
Another, more laborious solution that might not get stuck on the 'skinny-ness' of holes: take a regular sample of points along the polygon outline, then remove those that are too far from the buffered and un-buffered polygon. Turn the remaining points into a polygon.
# regular sample of 1000 points of the polygon
points <- x %>%
st_cast('LINESTRING') %>%
st_sample(size = 1000, type = 'regular') %>%
st_zm() %>% # remove z
st_cast('POINT')
# similar to solution 1, buffer & un-buffer original polygon
buffered <- x %>%
st_transform(3857) %>%
st_buffer(20) %>%
st_buffer(-20) %>%
st_transform(st_crs(points)) %>%
st_cast('LINESTRING')
# select sampled points that are very near the buffer,
# then cast them to a polygon
poly_from_points <- points[st_is_within_distance(points, buffered, dist = 5, sparse = F)] %>%
st_combine() %>%
st_cast('LINESTRING') %>%
st_cast('POLYGON')
With the example file given, this method might not be necessary. There could be edge cases where this is less likely to fail though.
Image below is zoomed in on the notch at the bottom right (south-east) to show the differences in the two methods along with the original polygon. Red is buffering & un-buffering (method 1), blue is only points from original polygon (method 2), and near-black is the original polygon.

Related

How to force text displayed by the tm_text function of R's tmap to remain inside polygon?

I'm using the tm_text function in the tmap package (for R) to draw maps and I want to force the text that is displayed to remain inside the polygon that it refers to. How do I achieve this with this function? (Some polygons are weirdly shaped and they end up having their text falling outside the polygon. The picture above illustrates an example; the 6th district has a weird shape. I would like to know if it is possible to have its label automatically placed inside the polygon as I need to plot a large number of maps.)
First of all, the text label for district 6 ends up outside the polygon because tmap's tm_text places labels at the polygon centroids, and the centroid of that polygon falls outside it. Here's a reproduction of the centroids (blue dots) that are used (see left plot below):
library(tidyverse)
library(sf)
library(tmap)
# shape file data from:
# https://earthworks.stanford.edu/catalog/tufts-uscongdist107th01
df <- st_read("GISPORTAL_GISOWNER01_USCONGDIST107TH01.shp") %>%
filter(STATE == "AL") %>%
distinct(CONG_DIST, .keep_all = T)
# get polygon centroids to show how tmap works
df_points_centroid <- df %>%
mutate(centroid = st_centroid(geometry)) %>%
as.data.frame() %>%
select(-geometry) %>%
st_as_sf()
tm_shape(df) +
tm_borders() +
tm_fill(col = "gray") +
tm_text("CONG_DIST") +
tm_shape(df_points_centroid) +
tm_dots(col = "blue", size = .1) +
tm_layout(frame = F)
To use label positions which are always inside the polygon you can use st_point_on_surface() from the sf package (see red dots and labels in the right plot):
# get points guaranteed to be within the polygon
df_points_within <- df %>%
mutate(point_within = st_point_on_surface(geometry)) %>%
as.data.frame() %>%
select(-geometry) %>%
st_as_sf()
# show old and new text locations together
tm_shape(df) +
tm_borders() +
tm_fill(col = "gray") +
tm_shape(df_points_centroid) +
tm_dots(col = "blue", size = .1) +
tm_shape(df_points_within) +
tm_dots(col = "red", size = .1) +
tm_text("CONG_DIST") +
tm_layout(frame = F)
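The difference between the two functions can be checked on a small made-up shape. The U-shaped polygon below is hypothetical and not taken from the congressional district data; it is only meant as a sketch of how a centroid can land in the gap of a concave polygon while st_point_on_surface() stays on the polygon.

```r
library(sf)

# Hypothetical U-shaped polygon: a 10 x 10 square with a slot cut out of the top
u_shape <- st_sfc(st_polygon(list(rbind(
  c(0, 0), c(10, 0), c(10, 10), c(8, 10),
  c(8, 2), c(2, 2), c(2, 10), c(0, 10), c(0, 0)
))))

centroid_pt <- st_centroid(u_shape)          # centre of mass, may fall in the slot
surface_pt  <- st_point_on_surface(u_shape)  # a point on the polygon itself

centroid_inside <- st_intersects(centroid_pt, u_shape, sparse = FALSE)[1, 1]
surface_inside  <- st_intersects(surface_pt, u_shape, sparse = FALSE)[1, 1]
```

Here centroid_inside is FALSE and surface_inside is TRUE, which mirrors what happens to the label of the oddly shaped district.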

st_centroid renders all labels on the same point

I'm trying to display labels on GIS polygon features in R using the st_centroid function in the sf library. Unfortunately, while the head() function seems to show that each polygon has different x and y coordinates associated with it, all labels get rendered overlapping at a single point on the map (which is apparently the centroid of one particular polygon). What am I doing wrong here?
Current code setup:
library("ggplot2")
library("sf")
sf::sf_use_s2(FALSE) #makes centroids not break
world <- st_read("C:/prgrm/gis/source/10m_land_and_islands.shp")
prov <- st_read("C:/prgrm/gis/edited ncm/ncm_provinces.shp")
prov <- cbind(prov, st_coordinates(st_centroid(prov))) #attaches centroids to 'prov' dataset
head(prov)
ggplot(data = world) +
geom_sf() +
geom_sf(data=prov, aes(fill="blue")) +
geom_text(data=prov, aes(X,Y, label=provname_r), size=5) +
coord_sf(xlim=c(-2000000,1000000),ylim=c(-1500000, 3000000), crs=st_crs(3310))
You may be better off specifying the label placement via the fun.geometry argument of the geom_sf_text() call. By the way, the default is sf::st_point_on_surface(), which is a good default because it makes sure that the label is not placed inside a hole, should the polygon have one.
Consider this example, using the well known & much loved nc.shp shapefile that ships with {sf}.
library(sf)
library(ggplot2)
# in place of your world dataset
shape <- st_read(system.file("shape/nc.shp", package="sf")) # included with sf package
# in place of your prov dataset
ashe <- shape[1, ]
ggplot(data = shape) +
geom_sf() +
geom_sf(data = ashe, fill = "blue") +
geom_sf_text(data = ashe,
aes(label = NAME),
color = "red",
fun.geometry = st_centroid)

how map certain USDA hardiness zones in R

Has anyone been able to create maps of a selection of USDA hardiness zones in R, maybe with the ggplot2 and sf packages? I'd specifically like to create a map with only zones 9b and higher in color.
I think some of the data to create the map is found here Prism Climate Group, but I am inexperienced and at a loss to know what to do with GIS data (file extensions SGML,XML,DBF, PRJ, SHP,SHX).
To elaborate a little bit on the answer by @niloc:
The USA looks more natural when shown in the Albers conical projection (Canada border slightly curved - like in the original image).
This can be achieved by using coord_sf(crs = 5070) in your {ggplot2} call.
The gist of the answer (downloading, unzipping & plotting via ggplot2::geom_sf()) remains unchanged.
library(sf)
library(tidyverse)
library(USAboundaries)
# Download and unzip file
temp_shapefile <- tempfile()
download.file('http://prism.oregonstate.edu/projects/public/phm/phm_us_shp.zip', temp_shapefile)
unzip(temp_shapefile)
# Read full shapefile
shp_hardness <- read_sf('phm_us_shp.shp')
# Subset to zones 9b and higher
shp_hardness_subset <- shp_hardness %>%
filter(str_detect(ZONE, '9b|10a|10b|11a|11b'))
# state boundaries for context
usa <- us_boundaries(type="state", resolution = "low") %>%
filter(!state_abbr %in% c("PR", "AK", "HI")) # lower 48 only
# Plot it
ggplot() +
geom_sf(data = shp_hardness_subset, aes(fill = ZONE)) +
geom_sf(data = usa, color = 'black', fill = NA) +
coord_sf(crs = 5070) +
theme_void() # remove lat/long grid lines
There is a lot going on in that map with all of the insets, the legend with F and C, states displayed over the CONUS. Would be better to narrow down your question.
But here is a start. The shapefile is composed of many files (XML, DBF, etc) but you only need to point read_sf() at the .shp file. Subsetting with an sf object can be done just like with a data.frame.
library(sf)
library(tidyverse)
# Download and unzip file
temp_shapefile <- tempfile()
download.file('http://prism.oregonstate.edu/projects/public/phm/phm_us_shp.zip', temp_shapefile)
unzip(temp_shapefile)
# Read full shapefile
shp_hardness <- read_sf('phm_us_shp.shp')
# Subset to zones 9b and higher
shp_hardness_subset <- shp_hardness %>%
filter(str_detect(ZONE, '9b|10a|10b|11a|11b'))
# Plot it
ggplot() +
geom_sf(data = shp_hardness_subset, aes(fill = ZONE)) +
geom_polygon(data = map_data("state"), # add states for context
aes(x=long, y=lat,group=group),
color = 'black',
fill = NA) +
theme_void() # remove lat/long grid lines
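As a small illustration of the point that an sf object can be subset like a data.frame, here is a sketch using the nc.shp file that ships with {sf} instead of the hardiness download. The two county names are picked arbitrarily for the example.

```r
library(sf)

# Example data bundled with the sf package
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Base-R row subsetting, exactly as with a data.frame
subset_nc <- nc[nc$NAME %in% c("Ashe", "Wake"), ]

# The result is still an sf object and keeps its geometry column
class(subset_nc)
nrow(subset_nc)
```

dplyr::filter() works in the same way, which is what the filter(str_detect(...)) step above relies on.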

Clip/Cut everything outside of Polygon or fill the outside with white

I have a square of colored noise with a triangle on it.
Now I want the polygon to cut this noise like a Christmas cookie cutter, resulting in a triangle of noise surrounded by the polygon path.
How can I clip all pixels that overlap the polygon border and then save the result as a PDF?
I came up with 2 ideas:
Method 1: Use a function that tests whether each pixel (colored noise) falls inside the shape or not. Let's do it!
Problem: The edges of the border pixels stick out past the line. In this example the effect is quite minimal, and you could argue for just making the polygon line a little thicker.
Method 2: Invert the polygon shape (equivalent to filling the outside of the polygon) and then fill it with white.
Problem:
In the plot preview window the result looks the way I want it. When I save it as a PDF, everything is white apart from the black polygon outlines.
Reproducible example:
library(magrittr)
library(ggplot2)
library(SDMTools)
polyGony <- c(0,0,100,50,50,100) %>% matrix(ncol=2,byrow = T) %>% as.data.frame()
deltaN <- 200 #grid width
sp1<-seq(1,100,length=deltaN)
sp2<-seq(1,100,length=deltaN)
sp<-expand.grid(x=sp1,y=sp2)
set.seed(1337)
sp$z <- sample(1:30,nrow(sp),replace = T)
# Method 1
outin = SDMTools::pnt.in.poly(sp[,1:2],polyGony)
outin$z <- sp$z
pointsInsideTri <- outin[outin$pip==1,-3]
p <- ggplot(pointsInsideTri, aes(x, y)) +
geom_raster(aes(fill = z)) +
scale_fill_gradientn(colours=c("#FFCd94", "#FF69B4", "#FF0000","#4C0000","#000000"))
p + geom_polygon(data = polyGony, aes(V1,V2),color="black", fill=NA) + theme(aspect.ratio = 1)
# Method 2
outSQ <-c(0,0,100,0,100,100,0,100)
invPolyGony <- c(outSQ,0,0,100,50,50,100) %>% matrix(ncol=2,byrow = T) %>% as.data.frame()
p <- ggplot(sp, aes(x, y)) +
geom_raster(aes(fill = z)) +
scale_fill_gradientn(colours=c("#FFCd94", "#FF69B4", "#FF0000","#4C0000","#000000"))
p + geom_polygon(data = invPolyGony, aes(V1,V2) ,colour="black", fill="white") + theme(aspect.ratio = 1)
I now know what the problem was. In order to fill everything outside of a polygon, the inner path (the hole in the middle) needs to run clockwise, while the outer border needs to run counter-clockwise.
I made a simple example. We have a polygon of a star. I want everything outside of the star to be red.
star <- c(25.000,1.000,31.000,18.000,49.000,18.000,35.000,29.000,40.000,46.000,
25.000,36.000,10.000,46.000,15.000,29.000,1.000,18.000,19.000,18.000) %>% matrix(ncol=2, byrow=T)
star <- rbind(star,star[1,])
rim <- c(0,0, 50,0, 50,50,0,50,0,0) %>% matrix(ncol=2, byrow=T)
datapolyM <- rbind(rim,star) %>% as.data.frame()
names(datapolyM) <- c("x","y")
ggplot(datapolyM, aes(x=x, y=y)) +
geom_polygon(fill="red", colour="black")
Export to pdf! You will see that the whole image is filled red!
Now let's reverse the path of the star so that it runs clockwise. Note the apply(2, rev) call at the end of the second line:
star <- c(25.000,1.000,31.000,18.000,49.000,18.000,35.000,29.000,40.000,46.000,
25.000,36.000,10.000,46.000,15.000,29.000,1.000,18.000,19.000,18.000) %>% matrix(ncol=2, byrow=T) %>% apply(2, rev)
star <- rbind(star,star[1,])
rim <- c(0,0, 50,0, 50,50,0,50,0,0) %>% matrix(ncol=2, byrow=T)
datapolyM <- rbind(rim,star) %>% as.data.frame()
names(datapolyM) <- c("x","y")
datapolyM$id <- "a"
ggplot(datapolyM, aes(x=x, y=y)) +
geom_polygon(fill="red")
Now export to pdf again. You will see it worked this time! You have filled everything outside of a given polygon-shape!
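If you are unsure which way a ring runs, one way to check is the sign of its shoelace (signed) area: positive for counter-clockwise, negative for clockwise, with y pointing up. The helper signed_area() below is a hypothetical name written for this sketch, not a function from ggplot2 or any other package; the rim and star coordinates are the ones used above.

```r
# Signed area via the shoelace formula; `ring` is a two-column matrix of x, y
signed_area <- function(ring) {
  x <- ring[, 1]
  y <- ring[, 2]
  nxt <- c(seq_len(nrow(ring))[-1], 1)  # index of the following vertex, wrapping around
  sum(x * y[nxt] - x[nxt] * y) / 2
}

rim <- matrix(c(0, 0, 50, 0, 50, 50, 0, 50, 0, 0), ncol = 2, byrow = TRUE)
star <- matrix(c(25, 1, 31, 18, 49, 18, 35, 29, 40, 46,
                 25, 36, 10, 46, 15, 29, 1, 18, 19, 18), ncol = 2, byrow = TRUE)

# Reversing the row order flips the direction, same effect as apply(2, rev)
star_reversed <- star[nrow(star):1, ]

sign(signed_area(rim))            # 1: counter-clockwise
sign(signed_area(star))           # 1: same direction as the rim
sign(signed_area(star_reversed))  # -1: clockwise, opposite to the rim
```

In the first attempt both rings run the same way, which matches the fully red PDF; after the reversal the star runs opposite to the rim.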

Plotting Shapefile on ggmap

I am attempting to plot several shapefiles on top of a map generated through ggmap. This is working well, however I want to constrain the view area to the shapefile (and not rely on the zoom argument in ggmap). I've done this by getting the bounding box and passing it as an argument in ggplot's coord_cartesian. While this works, I am getting some tearing issues on the edges of the map, most noticeably on the western portion. I've tried adjusting the x-y coordinates manually but it seems to only severely distort the picture.
My thoughts are to zoom out slightly to allow the entire shapefile to be plotted in the area, but I can't seem to figure it out. It's also possible I am going about this entirely in the wrong way.
Here's the code I used to generate the map. The shapefile can be downloaded here
library(dplyr)
library(ggmap)
library(rgdal)
library(broom)
# Read in shapefile, project into long-lat
# Create 'tbox' which is a minimum bounding box around the shapefile
tracts <- readOGR(dsn = ".", layer = "CensusTracts2010") %>%
spTransform("+proj=longlat +ellps=WGS84")
tbox <- bbox(tracts)
# Plot data
tract_plot <- tidy(tracts)
DetroitMap <- qmap("Detroit", zoom = 11)
DetroitMap + geom_polygon(data = tract_plot, aes(x = long, y = lat, group = id), color = "black", fill = NA) +
coord_cartesian(xlim = c(tbox[1,1], tbox[1,2]),
ylim = c(tbox[2,1], tbox[2,2]))
I followed your workflow and got the same problem you describe. I then changed the zoom in the qmap call from 11 to 10, which gave a much better picture. You do lose some of the place names, but you can add those back manually with annotate:
DetroitMap <- qmap("Detroit", zoom = 10)
DetroitMap + geom_polygon(data = tract_plot, aes(x = long, y = lat, group = id), color = "black", fill = NA) +
coord_cartesian(xlim = c(tbox[1,1], tbox[1,2]),
ylim = c(tbox[2,1], tbox[2,2]))
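The question also mentions zooming out slightly around the shapefile. One possible way to do that, sketched here under assumptions, is to pad the bounding box by a small fraction before passing it to coord_cartesian(). The helper pad_bbox() and the 5% margin are hypothetical, and the example matrix only mimics the 2 x 2 layout (rows x and y, columns min and max) that sp::bbox() returns; its values are made up.

```r
# Expand a 2 x 2 bounding-box matrix (rows: x, y; columns: min, max) by a fraction
pad_bbox <- function(bb, frac = 0.05) {
  span <- bb[, 2] - bb[, 1]
  bb[, 1] <- bb[, 1] - frac * span
  bb[, 2] <- bb[, 2] + frac * span
  bb
}

# Made-up extent standing in for bbox(tracts)
tbox_example <- matrix(c(-83.3, 42.25, -82.9, 42.45), ncol = 2,
                       dimnames = list(c("x", "y"), c("min", "max")))

padded <- pad_bbox(tbox_example, frac = 0.05)
padded
```

The padded limits would then go in as coord_cartesian(xlim = padded[1, ], ylim = padded[2, ]). This only helps if the base map fetched at the chosen zoom actually covers the padded extent.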
