Automatically reclassify small SpatialPolygons inserted into large SpatialPolygons using R

I would like to assign small polygons nested in larger polygons the same values as those of larger polygons.
In figure 1 you can see the small polygons in raster format:
and in figure 2 in SpatialPolygons as individual polygons:
These polygons are results of sorting by k-means, generating raster, and using the rasterFromXYZ function (code below):
mydata.26.raster <- rasterFromXYZ([,c("x", "y", "cls_26.cluster")]),res=5, crs=crs)
and then rasterToPolygons function I was able to separate the polygons (code below):
zona.26.pol<- rasterToPolygons(zona.26.raster$cls_26.cluster,dissolve=TRUE)
zona.26.pol <- disaggregate(zona.26.pol)
Here's zona.26.pol if you want to help. It is in .shp format.
I reclassified the polygons manually and finally merged them by class.
The result of my manual assignment, which I would like to achieve automatically (by creating rules), is shown in figure 3:
Any help is welcome!

This removes the small nested polygons based on their size alone and then fills the holes left in the larger, remaining polygons. It works for your example but may fail if some of the nested polygons you want to remove are larger than the threshold. In that case, we would have to figure out how to identify the geometries that are 'nested'.
library(sf)     # st_read, st_crs, st_area
library(units)  # as_units
library(nngeo)  # st_remove_holes
min_polygon_area <- 10000 # polygons smaller than this area are removed
units(min_polygon_area) <- as_units('m^2') # same units as returned by st_area below
zona.26.pol <- st_read(file.path(workDir, 'zona.26.pol.shp'))
st_crs(zona.26.pol) <- '+proj=utm +zone=22 +south +ellps=WGS84 +datum=WGS84 +units=m +no_defs' #define crs
zona.26.pol$area <- st_area(zona.26.pol)
zona.26.pol$area #note area is in m^2
plot(zona.26.pol[,'cls_26_']) #as-is plot
zona.26.pol <- zona.26.pol[zona.26.pol$area>min_polygon_area, ]
plot(zona.26.pol[,'cls_26_']) #small polygons removed; holes remaining
zona.26.pol_no_holes <- st_remove_holes(zona.26.pol)
plot(zona.26.pol_no_holes[,'cls_26_']) #holes removed
Note that I used the sf package to read-in the shapefile, in order to utilize the st_remove_holes function from the nngeo package, but I typically use raster and sp packages.
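One possible way to identify nested geometries, sketched here with toy data rather than the shapefile above: drop the holes of every polygon by keeping only its exterior ring, then test which polygons lie within another polygon's filled outline. The column name cls, the toy squares, and the loop are assumptions for illustration, and the sketch assumes single-part POLYGON geometries (as obtained after disaggregating).

```r
library(sf)

# Toy data: a large square with a hole, and a small square filling that hole
outer <- rbind(c(0, 0), c(10, 0), c(10, 10), c(0, 10), c(0, 0))
hole  <- rbind(c(4, 4), c(4, 6), c(6, 6), c(6, 4), c(4, 4))
big   <- st_polygon(list(outer, hole))
small <- st_polygon(list(hole))
polys <- st_sf(cls = c(1, 2), geometry = st_sfc(big, small))

# Keep only the exterior ring of each polygon, so holes are filled
filled <- st_sfc(lapply(st_geometry(polys), function(g) st_polygon(list(g[[1]]))))

# For each polygon, find the filled polygons that contain it
inside <- st_within(st_geometry(polys), filled)

# A polygon contained in a polygon other than itself is treated as nested
# and takes the class of its (first) host
for (i in seq_along(inside)) {
  host <- setdiff(inside[[i]], i)
  if (length(host) > 0) polys$cls[i] <- polys$cls[host[1]]
}
polys$cls
```

After reclassification, polygons sharing a class could be dissolved in a separate step.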


Query raster brick layer based on another raster in R

I have a NetCDF file of global oceanographic (OmegaA) data at relatively coarse spatial resolution with 33 depth levels. I also have a global bathymetry raster at much finer resolution. My goal is to get the seabed OmegaA data from the NetCDF file, using the bathymetry data to determine the desired depth. My code so far:
# Aragonite data. Defaults to CRS WGS84
ncin <- nc_open("C:/..../")
ncin.depth <- ncvar_get(ncin, "Depth")# 33 depth levels
omegaA.brk <- brick("C:/.../")
omegaA.brk <- rotate(omegaA.brk) # because the NetCDF longitudes run 0-360
# depth raster. CRS WGS84
# resample the raster brick to the resolution that matches the bathymetry raster
omegaA.brk <-resample(omegaA.brk, r, method="bilinear")
# create blank final raster
omegaA.rast <- raster(ncol = r@ncols, nrow = r@nrows)
extent(omegaA.rast) <- extent(r)
omegaA.rast[] <- NA_real_
# create vector of indices of desired depth values
# loop to find appropriate raster brick layer, and extract the value at the desired index, and insert into blank raster
for (p in depth.values.index) {
  dep.index <- which(abs(ncin.depth + depth.values[p]) == min(abs(ncin.depth + depth.values[p]))) ## this sometimes results in multiple levels being selected
  brk.level <- omegaA.brk[[dep.index]] # can be more than one level if multiple layers selected above
  omegaA.rast[p] <- omegaA.brk[[1]][p] ## here I choose the first level if multiple levels have been selected above
  print(paste(p, "of", length(depth.values.index))) # counter to look at progress
}
The problem: The result is a raster with massive gaps (NAs) in it where there should be data. The gaps often take a distinctive shape - eg, follow a contour, or along a long straight line. I've pasted a cropped example.
I think this could be because either 1) for some reason the 'which' statement in the loop is not finding a match or 2) a misalignment of the projections is created which I've read can happen when using 'Rotate'.
I've tried to make sure all the extents, resolutions, number of cells, and CRS's are all the same, which they seem to be.
To speed up the process I've cropped the global brick and bathy raster to my area of interest, again checking that all the spatial resolutions, etc etc match - I've not included those steps here for simplicity.
At a loss. Any help welcome!
Without a reproducible example, this kind of problem is hard to solve. I can't tell where your problem is, but I'll present the approach I would try. Maybe it's good, maybe it's bad, but it may inspire you to find a way around your problem.
As I understand it, you have a brick of OmegaA (33 layers, one per depth level) and a bathymetry raster, and you want the OmegaA value at the bottom of the sea. Here is how I would do it:
Resample the OmegaA brick to the same resolution and extent as the bathymetry raster.
Transform the bathymetry raster into a raster brick of 33 layers of 0/1 values. E.g., if the sea bottom is at 200 m for a particular pixel, then this pixel is 1 in the 200 m layer and 0 in all other depth layers. To program this, I would go the long way, something like
r_1 <- r
values(r_1) <- values(r)==10 # where 10 is the depth (it could be a range with < or >)
r_2 <- r
values(r_2) <- values(r)==20
r_33 <- r
values(r_33) <- values(r)==250
r_brick <- brick(r_1, r_2, ..., r_33)
Then multiply the two raster bricks. They have the same dimensions, so it should be easy. The output should be a raster brick of 33 layers with 0 everywhere that isn't the bottom of the sea and the value of OmegaA where it is.
Combine all the layers of the resulting brick into a single raster with a sum.
This should work. If you have problems dealing with raster bricks, you could convert the data into base R arrays, which may be simpler.
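The mask-multiply-sum idea can be sketched on toy data as follows. Everything here is an assumption for illustration: three hypothetical depth levels instead of 33, a 2 x 2 grid, constant layer values, and bathymetry given as positive depths. Using which.min() also returns a single level when two levels are equally close.

```r
library(raster)

# Hypothetical depth levels (m); toy stack whose layer i holds the constant value i
depth_levels <- c(0, 100, 200)
omega <- stack(lapply(seq_along(depth_levels), function(i) {
  raster(nrows = 2, ncols = 2, vals = rep(i, 4))
}))

# Toy bathymetry on the same grid, as positive depths in metres
bathy <- raster(nrows = 2, ncols = 2, vals = c(10, 90, 210, 120))

# Index of the nearest depth level for every cell (first match on ties)
idx <- sapply(values(bathy), function(d) which.min(abs(depth_levels - d)))

# One 0/1 layer per depth level: 1 where that level is the seabed level
masks <- stack(lapply(seq_along(depth_levels), function(i) {
  setValues(bathy, as.numeric(idx == i))
}))

# Multiply layer by layer, then sum across layers
seabed <- calc(omega * masks, fun = sum)
values(seabed)
```

Note that cells with NA bathymetry are not handled in this sketch; they would need to be filtered out before computing the index.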
Good luck.

Assigning spatial coordinates to an array in R

I have downloaded a text file of data from the following link:
After unzipping I use the following lines of code to plot up the data:
# load libraries
library(fields) # provides image.plot
# function to rotate a matrix (and transpose)
rotate <- function(x) t(apply(x, 2, rev))
# read data
data <- as.matrix(read.table("~/Downloads/us_rn_50km.txt", skip=6))
data[data<=0] <- NA
# rotate data
data <- rotate(data)
# plot data
mean.rn <- mean(data, na.rm=T)
image.plot(data, main=paste("Mean Rn emissions =", sprintf("%.3f", mean.rn)) )
This all looks OK, but I want to be able to plot the data on a lat-long grid. I think I need to convert this array into an sp class object but I don't know how. I know the following (from the web site): "The projection used to project the latitude and longitude coordinates is that used for the Decade of North American Geology (DNAG) maps. The projection type is Spherical Transverse Mercator with a base latitude of zero degrees and a reference longitude of 100 degrees W. The scale factor used is 0.926 with no false easting or northing. The longitude-latitude datum is NAD27 and the units of the xy-coordinates are in meters. The ellipsoid used is Clarke 1866. The resolution of the map is 50x50km". But don't know what to do with this data. I tried:
data.sp <- SpatialPoints(data, CRS("+proj=longlat+ellps=clrk66+datum=NAD27") )
But I had various problems (with NAs), and fundamentally I think that the data isn't in the right format. I think that the SpatialPoints function wants data on location (in 2-D) and a third array of values associated with those locations (x,y,z data). I guess my problem is working out the x's and y's from my data!
Any help greatly appreciated!
The file in question is an ASCII raster grid. Coordinates are implicit in this format; a header describes the position of the (usually) lower left corner, as well as the grid dimensions and resolution. After this header section, values separated by white space describe how the variable varies across the grid, with values given in row-major order. Open it in a text editor if you're interested.
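To illustrate how the implicit coordinates follow from the header, here is a minimal base R sketch. The header values are hypothetical, not those of the file in question; it assumes the common layout in which the header gives the lower left corner and the rows are stored from north to south.

```r
# Hypothetical header values, for illustration only
ncols <- 4
nrows <- 3
xllcorner <- 1000
yllcorner <- 2000
cellsize <- 50

# Cell-centre coordinates; the first row in the file is the northernmost one
x <- xllcorner + (seq_len(ncols) - 0.5) * cellsize
y <- yllcorner + (rev(seq_len(nrows)) - 0.5) * cellsize

# One (x, y) pair per value, in the row-major order of the file
# (expand.grid varies its first argument fastest)
xy <- expand.grid(x = x, y = y)
head(xy)
```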
You can import such files to R with the fantastic raster package, as follows:
destfile={f <- tempfile()})
unzip(f, exdir=tempdir())
r <- raster(file.path(tempdir(), 'us_rn_50km.txt'))
You can plot it immediately, without assigning the projection:
If you didn't want to transform it to another CRS, you wouldn't necessarily need to assign the current coordinate system. But since you do want to transform it to WGS84 (geographic), you need to first assign the CRS:
proj4string(r) <- '+proj=tmerc +lon_0=-100 +lat_0=0 +k_0=0.926 +datum=NAD27 +ellps=clrk66 +a=6378137 +b=6378137 +units=m +no_defs'
Unfortunately I'm not entirely sure whether this proj4string correctly reflects the info given at the website that provided the data (it would be great if they actually provided the definition in a standard format).
After assigning the CRS, you can project the raster with projectRaster:
r.wgs84 <- projectRaster(r, crs='+init=epsg:4326')
And if you want, write it out to a raster format of your choice, e.g.:
writeRaster(r.wgs84, filename='whatever.tif')

How to get count of non-NA raster cells within polygon

I've been running into all sorts of issues using ArcGIS ZonalStats and thought R could be a great way. Saying that I'm fairly new to R, but got a coding background.
The situation is that I have several rasters and a polygon shape file with many features of different sizes (though all features are bigger than a raster cell and the polygon features are aligned to the raster).
I've figured out how to get the mean value for each polygon feature using the raster library with extract:
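#load packages required
library(raster)   # raster, extract, cellFromPolygon
library(rgdal)    # readGDAL
library(maptools) # readShapePoly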
# ---Set the working directory-------
datdir <- "/test_data/"
#Read in a ESRI grid of water depth
ras <- readGDAL("test_data/raster/pl_sm_rp1000/w001001.adf")
#convert it to a format recognizable by the raster package
ras <- raster(ras)
#read in polygon shape file
proxNA <- readShapePoly("test_data/proxy/PL_proxy_WD_NA_test")
#plot raster and shp
#calc mean depth per polygon feature
#unweighted - only assigns grid to district if centroid is in that district
proxNA@data$RP1000 <- extract(ras, proxNA, fun = mean, na.rm = TRUE, weights = FALSE)
#check results
#plot depth values
The issue I have is that I also need an area-based ratio between the area of the polygon and all non-NA cells in the same polygon. I know what the cell size of the raster is and I can get the area for each polygon, but the missing link is the count of all non-NA cells in each feature. I managed to get the cell numbers of all the cells in each polygon with proxNA@data$Cnumb1000 <- cellFromPolygon(ras, proxNA), and I'm sure there is a way to get the actual value of each raster cell, which then requires a loop to count all the non-NA cells, etc.
BUT, I'm sure there is a much better and quicker way to do that! If any of you has an idea or can point me in the right direction, I would be very grateful!
I do not have access to your files, but based on what you described, this should work:
masked_img=mask(nonNA_raster,mask_layer) #based on centroid location of cells
nonNA_count=cellStats(masked_img, sum)
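If a count per polygon feature is needed rather than one total, a possible alternative is to pass a counting function to extract. This is a sketch on toy data, not tested against the files in the question; the raster values and the two polygons are made up for illustration.

```r
library(raster)
library(sp)

# Toy 4 x 4 raster on [0,4] x [0,4] with some NA cells (values fill row by row from the top)
r <- raster(nrows = 4, ncols = 4, xmn = 0, xmx = 4, ymn = 0, ymx = 4)
values(r) <- c(1, NA, 3, 4,
               5, 6, NA, 8,
               9, 10, 11, 12,
               NA, 14, 15, 16)

# Two toy polygons covering the left and right halves of the raster
left  <- Polygons(list(Polygon(cbind(c(0, 2, 2, 0, 0), c(0, 0, 4, 4, 0)))), "left")
right <- Polygons(list(Polygon(cbind(c(2, 4, 4, 2, 2), c(0, 0, 4, 4, 0)))), "right")
polys <- SpatialPolygons(list(left, right))
crs(polys) <- crs(r) # give both objects the same CRS

# Count the non-NA cells whose centre falls in each polygon;
# the ... absorbs the na.rm argument that extract passes to fun
n_valid <- extract(r, polys, fun = function(x, ...) sum(!is.na(x)))
n_valid
```

Multiplying each count by the cell area and dividing by the polygon area would then give the ratio described in the question.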

How to pick up the information for the nearest associated polygon to points using R?

I'm figuring out how to do an intersection (spatial join) between points and polygons from shapefiles. My idea is to get the closest points and those points that fall completely inside the polygons. In ArcGIS there is a match option named CLOSEST, defined as: "The feature in the join features that is closest to a target feature is matched. It is possible that two or more join features are the same distance away from the target feature. When this situation occurs, one of the join features is randomly selected as the matching feature."
I have a function to intersect points with polygons; it was kindly contributed by Lyndon Estes on the r-sig-geo list, and the code works very well when the polygons fill the whole area. The second case is a spatial join by distance, which in ArcGIS is known as INTERSECT with match_option set to CLOSEST. With it, you can set the minimum distance between the point and the polygon when the area is not completely filled by polygons.
Here's the data and the function of the first INTERSECT:
library(sp) <- readShapeSpatial("") <- readShapeSpatial("" )
IntersectPtWithPoly <- function(x, y) {
  # Extracts values from a SpatialPolygonsDataFrame with a SpatialPointsDataFrame, and appends the table (similar to
  # ArcGIS intersect)
  # Args:
  #   x: SpatialPoints*Frame
  #   y: SpatialPolygonsDataFrame
  # Returns:
  #   SpatialPointsDataFrame with appended table of polygon attributes
  # Set up overlay with new column of join IDs in x
  z <- overlay(y, x)
  # Bind captured data to points dataframe
  x2 <- cbind(x, z)
  # Make it back into a SpatialPointsDataFrame
  # Account for different coordinate variable names
  if (("coords.x1" %in% colnames(x2)) & ("coords.x2" %in% colnames(x2))) {
    coordinates(x2) <- ~coords.x1 + coords.x2
  } else if (("x" %in% colnames(x2)) & ("y" %in% colnames(x2))) {
    coordinates(x2) <- ~x + y
  }
  # Reassign its projection if it has one
  if (!is.na(proj4string(x))) {
    x2@proj4string <- x@proj4string
  }
  return(x2)
}
test<-IntersectPtWithPoly (,
Sharing some ideas with Lyndon, he told me this:
I think the easiest thing to do would be to put a buffer around each of the points (you could specify 50 m if it is in projected coordinates), converting them to polygons, and then your task becomes an intersection of two different polygon objects.
I haven't done this type of operation in R, but I suspect you could find your answer with the following functions:
I suggest putting up a subset of your data illustrating the problem, and then maybe someone else who has a better idea on polygon to polygon intersects/overlays could suggest the method.
A buffer should be made around the points in the shapefile so that they fall into the nearest polygon.
I know that these functions could help to achieve it.
I'm working on it, so any comment or help would be very much appreciated!
I have found that it's possible to do polygon-to-polygon overlays using sp and rgeos. You'd need to load rgeos after you load sp.
over(polygon1, polygon2)
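As an alternative to buffering, a CLOSEST-style match might be sketched with the sf package, which provides st_nearest_feature. This is a swapped-in technique, not the sp/rgeos approach above, and the squares, points, and column names are made up for illustration.

```r
library(sf)

# Helper (hypothetical) that builds a unit square with lower left corner (x0, y0)
sq <- function(x0, y0) {
  st_polygon(list(rbind(c(x0, y0), c(x0 + 1, y0), c(x0 + 1, y0 + 1),
                        c(x0, y0 + 1), c(x0, y0))))
}

# Two toy polygons with a gap between them, and two toy points
polys <- st_sf(name = c("A", "B"), geometry = st_sfc(sq(0, 0), sq(3, 0)))
pts <- st_sf(id = 1:2,
             geometry = st_sfc(st_point(c(0.5, 0.5)),   # inside A
                               st_point(c(2.5, 0.5))))  # in the gap, closer to B

# Index of the nearest polygon for every point (a containing polygon is at distance 0)
nearest <- st_nearest_feature(pts, polys)

# Attach the polygon attribute and the distance to the matched polygon
pts$name <- polys$name[nearest]
pts$dist <- as.numeric(st_distance(pts, polys[nearest, ], by_element = TRUE))
pts
```

Filtering on pts$dist afterwards would give the equivalent of a maximum search radius.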

Intersecting Points and Polygons in R

I am working with shapefiles in R, one is point.shp the other is a polygon.shp.
Now, I would like to intersect the points with the polygon, meaning that all the values from the polygon should be attached to the table of the point.shp.
I tried overlay() and spRbind in package sp, but nothing did what I expected them to do.
Could anyone give me a hint?
With the new sf package this is now fast and easy:
out <- st_intersection(points, poly)
Additional options
If you do not want all fields from the polygon added to the point feature, just call dplyr::select() on the polygon feature before:
poly %>%
select(column-name1, column-name2, etc.) -> poly
out <- st_intersection(points, poly)
If you encounter issues, make sure that your polygon is valid:
st_is_valid(poly)
If you see some FALSE outputs here, try to make it valid:
poly <- st_make_valid(poly)
Note that these 'valid' functions depend on a sf installation compiled with liblwgeom.
If you do overlay(pts, polys) where pts is a SpatialPointsDataFrame object and polys is a SpatialPolygonsDataFrame object then you get back a vector the same length as the points giving the row of the polygons data frame. So all you then need to do to combine the polygon data onto the points data frame is:
o = overlay(pts, polys)
pts@data = cbind(pts@data, polys[o,])
HOWEVER! If any of your points fall outside all your polygons, then overlay returns an NA, which will cause polys[o,] to fail, so either make sure all your points are inside polygons or you'll have to think of another way to assign values for points outside the polygon...
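One way to keep points that fall outside every polygon is a left spatial join, sketched here with sf::st_join on toy data (the geometries and column names are assumptions for illustration). Points without a matching polygon keep their row and get NA in the polygon columns.

```r
library(sf)

# One toy polygon and two toy points, one inside and one outside
poly <- st_sf(zone = "A",
              geometry = st_sfc(st_polygon(list(rbind(c(0, 0), c(2, 0), c(2, 2),
                                                      c(0, 2), c(0, 0))))))
pts <- st_sf(id = 1:2,
             geometry = st_sfc(st_point(c(1, 1)), st_point(c(5, 5))))

# Left join by default: every point is kept, unmatched points get NA attributes
joined <- st_join(pts, poly, join = st_within)
joined
```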
You can do this in one line with point.in.poly from the spatialEco package.
new_shape <- point.in.poly(pts, polys)
from the documentation: "intersects point and polygon feature classes and adds polygon attributes to points".
