Counting polygons in shapefile - r

I have multiple species ranges that look like the following (colored blue in this example). This one runs east to west across Africa:
I can get the total area by using gArea in the rgeos package. What I want to know is how many individual polygons make up this file - i.e. how many distinct regions are there to the total range (this could be islands, or just separated populations) and what the ranges of those polygons are. I have been using the following code:
library(maptools)  # provides readShapeSpatial()
# Load example shapefile
shp <- readShapeSpatial("species1.shp")
# How many polygon slots are there?
length(shp@polygons)
# [1] 2
# How many polygons are in each slot?
length(shp@polygons[[1]]@Polygons)
length(shp@polygons[[2]]@Polygons)
and to get the area of a particular one:
shp@polygons[[1]]@Polygons[[1]]@area
Is this correct? I'm worried that a lake in the middle of the range might constitute a polygon on its own? I want to end up with a list that is roughly like:
           Species A  Species B
Polygon 1  12         11
Polygon 2  13         10
Polygon 3  14         NA
If the above code is correct, it should be pretty straightforward to put it in a loop and compile, for every species, a list of how many polygons there are and what their individual ranges are.
Thanks

This is a very un-glamorous solution, but it gets the job done at the moment.
for (s in seq_along(shpfiles)) {
  shp <- shpfiles[[s]]

  # 1) Create a master list of all Polygon objects within a shapefile
  # How many lists of polygons (features) are there within the shapefile?
  num.polygon.lists <- length(shp@polygons)

  # Get the Polygon objects of every feature
  master <- vector(mode = "list", length = num.polygon.lists)
  for (j in seq_len(num.polygon.lists)) {
    master[[j]] <- shp@polygons[[j]]@Polygons
  }

  # Combine them into a single flat list
  polygon.files <- do.call(c, master)

  # 2) Count the Polygon objects that are not holes
  # TRUE where the slot "hole" is TRUE, FALSE otherwise
  is.hole <- vapply(polygon.files, function(p) isTRUE(p@hole), logical(1))

  # Count how many times FALSE appears: those polygons are not holes
  p.count <- sum(!is.hole)
  print(p.count)
}
It's a start

I used the following code to find out how many multipart polygons there were in each "row" of a shapefile...
sapply(shapefile@polygons, function(p) length(p@Polygons))
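On the worry about lakes: a minimal sketch, assuming the sp package and a made-up toy range (one mainland ring with a lake, plus one island), of counting only the rings whose `hole` slot is FALSE. The object names (`toy`, `rings`, `n_regions`) are illustrative and not from the original posts.

```r
library(sp)

# Toy range: one outer ring with a lake (hole) plus one separate island.
# All coordinates are invented for illustration.
outer  <- Polygon(cbind(c(0, 4, 4, 0, 0), c(0, 0, 4, 4, 0)), hole = FALSE)
lake   <- Polygon(cbind(c(1, 1, 2, 2, 1), c(1, 2, 2, 1, 1)), hole = TRUE)
island <- Polygon(cbind(c(6, 7, 7, 6, 6), c(0, 0, 1, 1, 0)), hole = FALSE)

toy <- SpatialPolygons(list(
  Polygons(list(outer, lake), ID = "mainland"),
  Polygons(list(island), ID = "island")
))

# Flatten the Polygon rings of every feature into one list
rings <- unlist(lapply(toy@polygons, slot, "Polygons"), recursive = FALSE)

# Keep only the rings that are not holes
is_hole <- vapply(rings, slot, logical(1), "hole")
n_regions <- sum(!is_hole)

# Ring areas of the non-hole rings, read from the "area" slot
region_areas <- vapply(rings[!is_hole], slot, numeric(1), "area")
```

In this toy case the lake is skipped, so two regions are counted. As far as I can tell, the `area` slot describes each ring on its own, so a lake's area is not subtracted from the surrounding ring; gArea would be the safer choice when that matters.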

Related

How to filter out each polygon from its list of neighbors with sf::st_intersects

I need to extract the neighboring polygons for each polygon in an sf dataset.
Here's a quick example:
library(tidyverse)
library(sf)
demo(nc, ask = FALSE, verbose = FALSE)
nc <- nc %>%
mutate(polygon_id = row_number())
I have managed to extract the neighbors with sf::st_intersects
neighbors <- st_intersects(nc, nc)
neighbors[[5]]
[1] 5 6 9 16 28
The issue is that each polygon (here, 5) is being included in the list of neighbors. Using only one nc dataset gives me the same result
neighbors <- st_intersects(nc)
neighbors[[5]]
[1] 5 6 9 16 28
Any tips on how to filter out the actual polygon from the list of adjacent/neighboring polygons?
Good question, and one with many possible solutions. The code below, run in JupyterLab with the R kernel, shows one way to approach it.
There are 100 counties in the nc dataset. This code displays the selected county in color and shows all the neighboring counties. It works for any of the 100 counties in nc; county 100 is selected here.
Code:
nc1 <- nc %>% mutate(c_id = 1:nrow(nc))
n <- 100
# Logical matrix (one column): which counties intersect county n?
grp <- st_intersects(nc1, nc1[n, 1], sparse = FALSE)
neighborhood <- nc1[grp[, 1], ]
neighborhood
plot(st_geometry(neighborhood))
# Highlight the selected county on top of its neighborhood
plot(st_geometry(nc1[n, ]), col = 'blue', add = TRUE)
This code is easily extended. I wrote a quick little function that displays the names of the neighboring counties (not shown here), but this question seems most likely to be a plotting-related question.
The plot is shown at Link
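To drop each polygon from its own neighbor set, one option is to remove index i from element i of the result. The sketch below uses a plain list as a stand-in for the sparse output of st_intersects(nc); the values and the helper name `drop_self` are illustrative assumptions.

```r
# A plain list standing in for the sparse result of st_intersects(nc);
# the values are illustrative, not taken from the real dataset.
neighbors <- list(c(1, 2), c(1, 2, 3), c(2, 3))

# Remove each polygon's own index from its neighbor set
drop_self <- function(nb) {
  lapply(seq_along(nb), function(i) setdiff(nb[[i]], i))
}

neighbors_no_self <- drop_self(neighbors)
```

The same indexing should work on the real result, since its elements are also read with `neighbors[[i]]`. Depending on what counts as a neighbor, sf::st_touches(nc) may be an alternative, as a polygon does not touch itself.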

How do I remove a subset of polygons from a Large SpatialPolygonsDataFrame using a string search, in R?

I have a spatial file in R, that contains all the area units for New Zealand. I have downloaded it in NZGD2000 format. In this file I have irrelevant geographic details, such as the Oceanic regions. I have managed to remove those from my data by simply removing those polygons with higher than a certain value.
library("dplyr")
library("rgdal")
library("rgeos")
NZAreas <- readOGR("[FILEPATH]/area-unit-2013.shp")
#remove the areas that are offshore
NZAreas@data$AU2013_V1_ <- as.numeric(as.character(NZAreas@data$AU2013_V1_))
NZAreas <- NZAreas[NZAreas@data$AU2013_V1_ < 614000,]
I have the problem that the area units include inlets and inland water. I can't remove those in the same way as I removed the coastal units, as the area unit values are not contiguous. The @data$AU2013_V_1 column contains the labels for the area units. All the area units I wish to remove have a label starting with "Inlet" or "Inland Water".
I can't work out how to remove these polygons from the data.
First I tried without the object name in front of @data:
NZAreas <- NZAreas[!grepl("Inlet", @data$AU2013_V_1),]
Error: unexpected '@' in "NZAreas <- NZAreas[!grepl("Inlet", @"
and then I tried:
NZAreas <- NZAreas[!grepl("Inlet", NZAreas@data$AU2013_V_1),]
That second line runs but does not remove the polygons; it does not seem to do anything to the Large SpatialPolygonsDataFrame. I checked the data frame I constructed from NZAreas and it still contains Inlet and Inland Water rows. How do I remove these polygons?
This should work. It removed 49 areas with "Inlet" in the label and 15 areas with "Inland Water" in the label.
> dim(NZAreas)
[1] 2004 5
> NZAreas=NZAreas[!grepl("Inlet", NZAreas$AU2013_V_1),]
> dim(NZAreas)
[1] 1955 5
> NZAreas=NZAreas[!grepl("Inland Water", NZAreas$AU2013_V_1),]
> dim(NZAreas)
[1] 1940 5
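Since the question says the unwanted labels start with "Inlet" or "Inland Water", both filters can be folded into one anchored pattern. This is a minimal sketch on a hypothetical attribute table; the data frame `areas` and its labels are invented for illustration.

```r
# Hypothetical attribute table; the labels are invented for illustration
areas <- data.frame(
  code  = 1:5,
  label = c("Inlet-Port Nicholson", "Kelburn", "Inland Water-Lake Taupo",
            "Karori", "Inlet-Bay of Islands"),
  stringsAsFactors = FALSE
)

# One pattern, anchored at the start of the label, covers both prefixes
is_water <- grepl("^(Inlet|Inland Water)", areas$label)
land <- areas[!is_water, ]
```

The same logical vector can be used to subset the rows of a SpatialPolygonsDataFrame, for example NZAreas[!is_water, ], provided it was computed from that object's label column.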

Dividing Individual Spatial Polygons Equally in R

I have a shapefile of polygons that are the townships in the state of Iowa. I'd like to divide each element (i.e. each township) into 9 equal parts (i.e. a 3 x 3 grid for each township). I've figured out how to do this, but am having trouble forming a new dataframe out of the new polygons. My code is below. The data can be downloaded here: https://ufile.io/wi6tt
library(sf)
library(tidyverse)
setwd("~/Desktop")
iowa<-st_read( dsn="Townships/iowa", layer="PLSS_Township_Boundaries", stringsAsFactors = F) # import data
## Make division
r<-NULL
for (row in 1:nrow(iowa)) {
r[[row]]<-st_make_grid(iowa[row,],n=c(3,3))
}
# Combine together
region<-NULL
for (row in 1:nrow(iowa)) {
region<-rbind(region,r[[row]])
}
region<-st_sfc(region,crs=4326) #convert to sfc
reg_id<-data.frame(reg_id=1:length(region)) #make ID for dataframe
# Make SF
region_df<-st_sf(reg_id,region)
The last line gives the following error:
Error in `[[<-.data.frame`(`*tmp*`, all_sfc_names[i], value = list(list( : replacement has 1644 rows, data has 14796
1644 is the number of rows in the initial Iowa dataframe, and 14796 is 9 times that, i.e. the number of grid cells.
Clearly the number of rows does not match the number of elements.
This might be a general R thing, rather than a spatial one, but I figured I'd post the whole thing in case someone has an idea of how to do the entirety of this a little more cleanly.
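One possible way around the row mismatch, sketched under the assumption that the per-township grids are sfc objects: concatenate them with c() instead of rbind(), so the result has one geometry per grid cell. The two square "townships" and the helper `sq` are made up for illustration and stand in for the Iowa layer.

```r
library(sf)

# Two made-up square "townships" standing in for the Iowa layer
sq <- function(x0, y0, size = 3) {
  st_polygon(list(rbind(
    c(x0, y0), c(x0 + size, y0), c(x0 + size, y0 + size),
    c(x0, y0 + size), c(x0, y0)
  )))
}
towns <- st_sf(town_id = 1:2, geometry = st_sfc(sq(0, 0), sq(5, 0)))

# One 3 x 3 grid per township, kept as a list of sfc objects
grids <- lapply(seq_len(nrow(towns)), function(i) {
  st_make_grid(towns[i, ], n = c(3, 3))
})

# c() concatenates sfc objects, giving one element per grid cell
region <- do.call(c, grids)

# Attach a cell ID and the ID of the township each cell came from
region_df <- st_sf(
  reg_id   = seq_along(region),
  town_id  = rep(towns$town_id, lengths(grids)),
  geometry = region
)
```

Note that st_make_grid divides each township's bounding box, so the cells are only equal parts of the township itself where the township is an axis-aligned rectangle.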

spatial join on two simple features {sf} with over 1 mil. entries as fast as possible

I hope this is not too trivial, but I really can't find an answer and I'm too new to the topic to come up with alternatives myself. So here is the problem:
I have two shapefiles, x and y, that represent different processing levels of a Sentinel-2 satellite image.
x contains about 1,300,000 polygons (segments) completely covering the image extent, without any further vital information.
y contains about 500 polygons representing the cloud-free area of the image (also covering most of the image except for a few "cloud-holes") as well as information about the image used, in 4 columns (Sensor, Time, ...).
I'm trying to add the image information to x in the places where x is covered by y. Pretty simple? I just can't find a way to make it happen without taking days.
I read x in as a simple feature {sf}, as reading it with shapefile / readOGR takes ages.
I tried different things with y:
When I try merge(x,y), I can only pass one sf object, as merge doesn't support two sf objects.
Merging x (as sf) and y (as shp) gives me the error "cannot allocate vector of size 13.0 Gb".
So I tried sf::st_join(x,y), which accepts two sf objects, but it still hasn't finished after 28 hours.
sf::st_intersect(x,y) took about 9 minutes for a 10,000-segment subset, so that might not be a lot faster for the whole dataset.
Could subsetting x into a few smaller pieces solve the whole thing, or is there another simple solution? Could I do something with my workspace to make the merge work, or is there simply no shortcut to joining that many polygons?
Thanks a lot in advance, and I hope my description isn't too fuzzy!
my tiny work station:
win 7 64 bit
8 GB RAM
Intel i7-4790 @ 3.6 GHz
I often face this kind of problem and, as @manotheshark2 affirms, I prefer to work in a loop, subsetting my vector layer. Here is my advice:
Load your data
library(raster)
library(rgdal)
x <- readOGR('C:/', 'sentinelCovers')
y <- readOGR('C:/', 'cloudHoles')
Assign a y ID to identify which x polygons intersect which y polygons, and create the answer column in the x table
x$xyID <- NA # Answer col
y$yID <- 1:nrow(y@data) # ID col
Run a loop subsetting x
for (posX in 1:nrow(x@data)){
  pol.x <- x[posX, ]
  intX <- raster::intersect(pol.x, y)
  # Uncomment ONE of the two lines below:
  # x$xyID[posX] <- intX@data$yID ## Use this if each x polygon meets a single y polygon
  # x$xyID[posX] <- paste0(intX@data$yID, collapse = ',') ## Use this if it can meet several y polygons
}
You can check whether it is better to run the loop over the x or the y layer
x$xyID <- NA # Answer col
x$xID <- 1:nrow(x@data) # ID col
for (posY in 1:nrow(y@data)){
  pol.y <- y[posY, ]
  # Return NULL instead of stopping if the intersection fails
  intY <- tryCatch(raster::intersect(pol.y, x), error = function(e) NULL)
  if (is.null(intY)) next
  x$xyID[x@data$xID %in% intY@data$xID] <- pol.y$yID
}
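The question also asks whether splitting x into smaller pieces could help. A minimal base-R sketch of that idea follows; `chunk_indices`, `run_in_chunks` and `process_chunk` are hypothetical helper names, and `process_chunk` stands in for whatever join is run on one subset (for instance function(rows) sf::st_join(x[rows, ], y)).

```r
# Split row indices 1..n into consecutive chunks of at most chunk_size rows
chunk_indices <- function(n, chunk_size) {
  idx <- seq_len(n)
  split(idx, ceiling(idx / chunk_size))
}

# process_chunk is a hypothetical stand-in for the real spatial join on
# one subset of rows; here it receives the row indices of the chunk
run_in_chunks <- function(n, chunk_size, process_chunk) {
  lapply(chunk_indices(n, chunk_size), process_chunk)
}

# Toy run: 10 rows in chunks of 4, returning the size of each chunk
chunks <- chunk_indices(10, 4)
results <- run_in_chunks(10, 4, function(rows) length(rows))
```

Whether this is actually faster depends on the data; the main benefit is that each piece fits in memory and partial results can be saved along the way.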

Merging (two and a half) countries from maps-package to one map object in R

I am looking for a map that combines Germany, Austria and parts of Switzerland into one spatial object. This area should represent the German-speaking areas in those three countries. I have some parts in place, but cannot find a way to combine them. If there is a completely different solution to this problem, I am still interested.
I get the German and the Austrian map by:
require(maps)
germany <- map("world",regions="Germany",fill=TRUE,col="white") #get the map
austria <- map("world",regions="Austria",fill=TRUE,col="white") #get the map
Switzerland is more complicated, as I only need the 60-70% of it that mainly speaks German. The cantons that do so (taken from the census report) are
cantonesGerman = c("Uri", "Appenzell Innerrhoden", "Nidwalden", "Obwalden", "Appenzell Ausserrhoden", "Schwyz", "Lucerne", "Thurgau", "Solothurn", "Sankt Gallen", "Schaffhausen", "Basel-Landschaft", "Aargau", "Glarus", "Zug", "Zürich", "Basel-Stadt")
The canton names can be used together with data from gadm.org/country (selecting Switzerland & SpatialPolygonsDataFrame -> Level 1 or via the direct link) to get the German-speaking areas from the gadm-object:
gadmCH = readRDS("~/tmp/CHE_adm1.rds")
dataGermanSwiss <- gadmCH[gadmCH$NAME_1 %in% cantonesGerman,]
I am now missing the merging step to get this information together. The result should look like this:
It represents a combined map consisting of the contours of the merged area (Germany + Austria + ~70% of Switzerland), without borders between the countries. If adding and leaving out the inter-country borders would be parametrizable, that would be great but not a must have.
You can do that like this:
Get the polygons you need
library(raster)
deu <- getData('GADM', country='DEU', level=0)
aut <- getData('GADM', country='AUT', level=0)
swi <- getData('GADM', country='CHE', level=1)
Subset the Swiss cantons (here an example list, not the correct one); there is no need for a loop for such things in R.
cantone <- c('Aargau', 'Appenzell Ausserrhoden', 'Appenzell Innerrhoden', 'Basel-Landschaft', 'Basel-Stadt', 'Sankt Gallen', 'Schaffhausen', 'Solothurn', 'Thurgau', 'Zürich')
GermanSwiss <- swi[swi$NAME_1 %in% cantone,]
Aggregate (dissolve) Swiss internal boundaries
GermanSwiss <- aggregate(GermanSwiss)
Combine the three countries and aggregate
german <- bind(deu, aut, GermanSwiss)
german <- aggregate(german)

Resources