I want to calculate the percentage area of habitat suitability of a species that overlaps with protected area polygons. I do not know the R language very well, but here is what I have so far.
These are the attributes of the area of habitat suitability derived from a maxent prediction:
class : RasterLayer
dimensions : 6480, 8520, 55209600 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -103, -32, -36, 18 (xmin, xmax, ymin, ymax)
crs : +proj=longlat +ellps=WGS84
of the protected areas:
Simple feature collection with 5667 features and 2 fields (with 8 geometries empty)
geometry type: GEOMETRY
dimension: XY
bbox: xmin: -118.6344 ymin: -59.85538 xmax: -25.29094 ymax: 32.48333
CRS: +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
Does someone know a way to calculate the percentage area of habitat suitability that overlaps with protected area polygons?
Sorry, I really do not know so much about how to work with these data. I hope I gave all the relevant information.
I would appreciate any input.
To answer your first question, you should be able to use zonal statistics to calculate the area of potential habitat found in protected areas using the spatialEco package:
zonal.stats(x, y, stats = c("min", "mean", "max"))
#x = Polygon object of class SpatialPolygonsDataFrame
#y = rasterLayer object of class raster
https://www.rdocumentation.org/packages/spatialEco/versions/1.3-0/topics/zonal.stats
Here is a reproducible example from the spatialEco package that first calculates the percentage of pixels in each polygon >= a threshold value and second calculates the sum of pixels in each polygon >= the threshold value used to reclassify the input raster. You might be interested in both avenues for your work.
library(spatialEco)
library(raster)
library(sp)
# here the fxn will calculate the percentage of cells >= 0.5
# percent x >= p function
pct <- function(x, p=0.50, na.rm = FALSE) {
if ( length(x[x >= p]) < 1 ) return(0)
if ( length(x[x >= p]) == length(x) ) return(1)
else return( length(x[x >= p]) / length(x) )
}
# create some example data
p <- raster(nrow=10, ncol=10)
p[] <- runif(ncell(p)) * 10
p <- rasterToPolygons(p, fun=function(x){x > 9})
r <- raster(nrow=100, ncol=100)
r[] <- runif(ncell(r))
plot(r)
plot(p, add=TRUE, lwd=4)
# run zonal statistics using pct functions
z.pct <- zonal.stats(x=p, y=r, stats = "pct")
z.pct
#Alternatively, reclassify the raster based on a threshold
r.c<-reclassify(r, c(-Inf, 0.5, 0, 0.5, Inf, 1)) #all values >0.5 reclassified to 1
plot(r.c)
plot(p, add=TRUE, lwd=4) #add poly to the plot
# run zonal stats and calculate sum of cells in each poly
z.sum <- zonal.stats(x=p, y=r.c, stats = "sum")
z.sum
Related
I have two raster layers for same area. I need to find the Euclidean distance between cell of coarse resolution raster and cell of fine resolution raster that fall within each cell of the pixels from my coarse resolution raster. For example:
The red square is the pixel of coarse resolution raster while the blue squares are the pixels of fine resolution raster. The black dot is the centroid of coarse resolution raster and the blue dots are the centroids of fine resolution raster.
There are similar questions posted, but the difference with my question is that I don't want to compute the nearest distances between raster cells.
My coarse resolution raster has a pixel size of 460m and my fine resolution raster of 100m. What I have done so far is to create point symbols from the centroids of the raster cells for both rasters. How can I compute the Euclidean distance between each coarse pixel and its corresponding fine pixels?
library(terra)
fr = rast("path/fine_image.tif") # fine resolution raster
cr = rast("path/coarse_image.tif") # coarse resolution raster
fr_p = as.points(fr,
values = T,
na.rm = T,
na.all = F) # fine resolution points
cr_p = as.points(cr,
values = T,
na.rm = T,
na.all = F) # coarse resolution points
I am not sure how to proceed from here. Any recommendations?
Here are my rasters:
fr = rast(ncols=108, nrows=203, nlyrs=1, xmin=583400, xmax=594200, ymin=1005700, ymax=1026000, names=c('B10_median'), crs='EPSG:7767')
cr = rast(ncols=23, nrows=43, nlyrs=1, xmin=583280, xmax=593860, ymin=1006020, ymax=1025800, names=c('coarse_image'), crs='EPSG:7767')
The solution came from the #michael answer and the output raster (after cropping and masking with a polygon shp) looks like this:
where the yellow squares are the cells from the coarse raster and the raster underneath it's the output from the code in the answer section.
This is a bit hacky but I think it might do what you want...
# Raster at fine resolution where values are cell indices
fr_cells <- fr
values(fr_cells) <- 1:ncell(fr)
# Second raster at fine resolution where values are indices of
# the surrounding coarse res cell (if there is one)
fr_cr <- fr
fr_xy <- xyFromCell(fr, 1:ncell(fr))
values(fr_cr) <- extract(cr, fr_xy, cells = TRUE)[, "cell"]
# Function to calculate distance given a pair of cell indices
fn <- function(x) {
fr_xy <- xyFromCell(fr, x[1])
cr_xy <- xyFromCell(cr, x[2])
sqrt( sum( (fr_xy - cr_xy)^2 ) )
}
fr_dist <- app(c(fr_cells, fr_cr), fun = fn)
You can use terra::distance for that
Example data
library(terra)
fr <- rast(ncols=108, nrows=203, nlyrs=1, xmin=583400, xmax=594200, ymin=1005700, ymax=1026000, names='B10_median', crs='EPSG:7767')
cr <- rast(ncols=23, nrows=43, nlyrs=1, xmin=583280, xmax=593860, ymin=1006020, ymax=1025800, names='coarse_image', crs='EPSG:7767')
Solution
pts <- as.points(cr, values=FALSE, na.rm=F)
crs(pts) <- crs(cr)
d <- distance(fr, pts)
Illustration
plot(d)
zoom(d, col=gray((1:255)/255))
lines(cr, col="red", lwd=2)
Note that this approach also computes the distance to the center of the nearest cell in cr for cells that are not covered by cr. You could remove those values with
dm <- mask(d, as.polygons(ext(cr)))
dm
#class : SpatRaster
#dimensions : 203, 108, 1 (nrow, ncol, nlyr)
#resolution : 100, 100 (x, y)
#extent : 583400, 594200, 1005700, 1026000 (xmin, xmax, ymin, ymax)
#coord. ref. : WGS 84 / Maharashtra (EPSG:7767)
#source(s) : memory
#name : B10_median
#min value : 0.000
#max value : 311.127
I have a raster, and want to only retain the sea part of the raster, and remove the land part or the raster. If my raster is "ras" and my SpatialpolygonDataFRame is "worldMap", I tried
ras.msk <- rgeos::gDifference(ras,worldMap)
however, I get the following error which I do not understand, but I gather that the function can only be used with two spdf's, not with a raster?
Error in RGEOSUnaryPredFunc(spgeom, byid, "rgeos_isvalid") : rgeos_convert_R2geos: invalid R class RasterLayer, unable to convert.
if I do
r2 <- crop(ras, worldMap)
r3 <- mask(r2, worldMap)
I get the land-part of the raster. How do I get the opposite so that the remaining raster excludes the overlapping spatialpolygondataframe area?
The end result I need is all raster point values at sea to be 1, and the raster point values on land to be 0.
My current code is as follows:
# Make raster layer of study area
ras = raster(ext=extent(-70, -55, -60, -38), res=c(0.01,0.01)) #lat/long xmin, xmax, ymin, ymax #
#give all raster points a "1"
ras[] <- 1
#project the raster
projection(ras) <- "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
# load land
library(rworldmap)
worldMap <- getMap(resolution = "high")
projection(worldMap) <- CRS(proj4string(ras))
#crop raster by land
ras.msk <- rgeos::gDifference(ras,worldMap)
Need to specify "inverse = T" in the mask function
r2 <- crop(ras, worldMap)
r3 <- mask(r2, worldMap, inverse = T)
I'd like to calculate the centroid of each cluster (in my example I considering points in < 10m distance to the same cluster) of points using mean operation (spdplyr package) for coordinates and another operation (sum) for the attribute(area) without success.
In my example:
#Packages
library(sp)
library(maptools)
library(spdplyr)
library(cluster)
# Small sample (40 points)
small.sample<-read.csv("https://raw.githubusercontent.com/Leprechault/trash/main/sample_points.csv")
#Convert to spatial object
xy.small.sample <- small.sample[,c(1,2)]
spdf.small.sample <- SpatialPointsDataFrame(coords = xy.small.sample, data = small.sample,
proj4string = CRS("+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"))
# Transform to UTM
utm.wgs.84 <- "+proj=utm +zone=21 +south +datum=WGS84 +units=m +no_defs"
small.sample.utm <- spTransform(spdf.small.sample, utm.wgs.84)
#
#Convert each cluster to one point
Rc=10 #Maximum distance between points - 10 meters
small.sample.utm.dm<-spDists(small.sample.utm) # Matrix distance
clusters <- as.hclust(agnes(small.sample.utm.dm, diss = T)) #Hierarchical Clustering
small.sample.utm#data$class<- cutree(clusters, h=Rc) # Cut into Groups 4.36 meters
# Average of x and y coordinates and area using spdplyr package
small.sample.utm.classification<- small.sample.utm %>%
group_by (class) %>%
summarise (area_clu=mean(area),x_clu=mean(coords[,1]),y_clu=mean(coords[,2]))
#Erro: Problem with `summarise()` input `x_clu`.
#x objeto 'coords' não encontrado
#i Input `x_clu` is `mean(coords[, 1])`.
#i The error occurred in group 1: class = 1.
My goal is:
# Original points representation
plot(small.sample.utm, pch=16)
# Cluster centroids representation
points(small.sample.utm.classification$x_clu,small.sample.utm.classification$y_clu, col="red")
# Labelling the area
text(small.sample.utm.classification$x_clu ~ small.sample.utm.classification$x_clu, labels=small.sample.utm.classification$area_clu, cex=0.9, font=2)
I have a question regarding conversion of rasters with categorical values from WGS84 to Mollweide projections. It looks like to conversion leads to alteration of the dataset values. Very unfortunately, I struggle to provide you with a reproducible example, so I’ll provide you with some details about my approach. You may have some tips on where my issue could come from, as this may be a common issue. The EU website https://ghsl.jrc.ec.europa.eu/download.php?ds=bu gives me access to the following rasters:
SMOD layer provides me with human settlement info (SMOD dataset which I transformed into 1) a urban mask with “1” for urban areas and “NA” for non urban areas 2) a rural mask with “1” for rural areas and “NA” for non rural areas). SMOD is available in Mollweide projections only.
POP layer provides me with human population density (number of people per grid cell). POP is available in both Mollweide and WGS84.
I tried two approaches to estimate rural and urban human population numbers.
My challenge is that I get different numbers for each of these approaches. I wonder why this is the case:
Approach 1) Change SMOD Mollweide projections to WGS84
# layers as provided by website
SMOD_MollweideProj <- raster ("./GHS_SMOD_POP2015_GLOBE_R2019A_54009_1K_V1_0.tif")
POP_WGSproj <- raster("./GHS_POP_E2015_GLOBE_R2019A_4326_30ss_V1_0.tif")
SMOD_WGSproj <- projectRaster(from=SMOD_MollweideProj, to= POP_WGSproj, method='ngb' , over=T )
#create rural and urban masks - Classes 30-23-22-21 if aggregated form the "urban domain", 13-12-11-10 form the "rural domain".
SMOD_rur_mask_1K <- SMOD_WGSproj
values(SMOD_rur_mask_1K)[values(SMOD_rur_mask_1K) >14] = NA
values(SMOD_rur_mask_1K)[values(SMOD_rur_mask_1K) <=13] = 1
SMOD_urb_mask_1K <- SMOD_WGSproj
SMOD_urb_mask_1K[SMOD_urb_mask_1K<20 ] <- NA
SMOD_urb_mask_1K[SMOD_urb_mask_1K>=21 ] <- 1
#Generate rural and urban population layers, based on total population per grid cell and rural and urban masks
POP_rur_1K_WGSproj <- POP_WGSproj * SMOD_rur_mask_1K
POP_urb_1K_WGSproj <- POP_WGSproj * SMOD_urb_mask_1K
#urban and rural population estimates
cellStats(POP_rur_1K_WGSproj, sum, na.rm=T)
#2024108119 which is different to the value I get with the second approach
cellStats(POP_urb_1K_WGSproj, sum, na.rm=T)
#5321638069 which is different to the value I get with the second approach
> SMOD_MollweideProj
class : RasterLayer
dimensions : 18000, 36082, 649476000 (nrow, ncol, ncell)
resolution : 1000, 1000 (x, y)
extent : -18041000, 18041000, -9e+06, 9e+06 (xmin, xmax, ymin, ymax)
crs : +proj=moll +lon_0=0 +x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs
values : 10, 30 (min, max)
> SMOD_WGSproj
class : RasterLayer
dimensions : 21600, 43200, 933120000 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
crs : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
values : 10, 30 (min, max)
> POP_WGSproj
class : RasterLayer
dimensions : 21600, 43200, 933120000 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
crs : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
values : 0, 459434.6 (min, max)
Approach 2) Work with SMOD and POP Mollweide projections
# layers as provided by website
SMOD_MollweideProj <- raster ("./GHS_SMOD_POP2015_GLOBE_R2019A_54009_1K_V1_0.tif")
POP_MollweideProj <- raster("./GHS_POP_E2015_GLOBE_R2019A_54009_1K_V1_0.tif")
#create rural and urban masks - Classes 30-23-22-21 if aggregated form the "urban domain", 13-12-11-10 form the "rural domain".
SMOD_rur_mask_1K <- SMOD_MollweideProj
values(SMOD_rur_mask_1K)[values(SMOD_rur_mask_1K) >14] = NA
values(SMOD_rur_mask_1K)[values(SMOD_rur_mask_1K) <=13] = 1
SMOD_urb_mask_1K <- SMOD_MollweideProj
SMOD_urb_mask_1K[SMOD_urb_mask_1K<20 ] <- NA
SMOD_urb_mask_1K[SMOD_urb_mask_1K>=21 ] <- 1
#Generate rural and urban population layers, based on total population per grid cell and rural and urban masks
POP_rur_1K_MollweideProj <- POP_MollweideProj * SMOD_rur_mask_1K
POP_urb_1K_MollweideProj <- POP_MollweideProj * SMOD_urb_mask_1K
#urban and rural population estimates
cellStats(POP_rur_1K_MollweideProj, sum, na.rm=T)
# 1726372189 which is different to the value I get with the first approach
cellStats(POP_urb_1K_MollweideProj, sum, na.rm=T)
# 5622956252 which is different to the value I get with the first approach
Thank you very much for your suggestions
Following on from the comments above, here's my suggested code for your Approach 2, i.e. working with Mollweide projection for both SMOD and POP. I downloaded data for a single cell rather than the global layer, simply to reduce execution time.
Note I've used raster::mask() to mask POP to urban & rural. With this method there is no need to set a value of 1 in the masks, you can just retain the original values after setting to NA the cells you wish to mask. See ?raster::mask.
library(raster)
smod <- raster("data/GHS_SMOD_POP2015_GLOBE_R2019A_54009_1K_V1_0_11_4.tif")
pop <- raster("data/GHS_POP_E2015_GLOBE_R2019A_54009_1K_V1_0_11_4.tif")
# produce rural mask
maskRural <- smod
maskRural[maskRural > 14] <- NA
# produce urban mask
maskUrban <- smod
maskUrban[maskUrban < 20] <- NA
# use raster::mask to produce rural and urban population layers
popRural <- raster::mask(pop, maskRural)
popUrban <- raster::mask(pop, maskUrban)
# check that the total population of rural + urban is equal to the original
sum(pop[], na.rm = TRUE)
sum(popUrban[], na.rm = TRUE) + sum(popRural[], na.rm = TRUE)
The final couple of lines just check that the sum of rural and urban populations equals the original total population. So for the single cell I used the results are:
> sum(pop[], na.rm = TRUE)
[1] 67487835
> sum(popUrban[], na.rm = TRUE) + sum(popRural[], na.rm = TRUE)
[1] 67487835
To project to WGS84 you could do something like this:
popUrbanWgs84 <- projectRaster(popUrban, crs = crs("+init=epsg:4326"), method = "bilinear")
You can also specify a res parameter here, otherwise it will choose one for you, but the question would be what resolution to use? The original resolution of 1km is roughly 0.008 degrees but in terms of longitude this varies across the globe.
My suggestion is to stick to Mollweide if at all possible.
Is there a quick way in R to do summary statistics on a raster based on latitudinal intervals or bins. Not a summary of the entire raster layer but spatial subsections. For example, get the mean and sd of raster cell values for every two degrees in latitude.
Below is some example data of a projected raster with Lat/Long coordinates.
set.seed(2013)
library(raster)
r <- raster(xmn=-110, xmx=-90, ymn=40, ymx=60, ncols=40, nrows=40)
r <- setValues(r, rnorm(1600)) #add values to raster
r[r > -0.2 & r < 0.2] <- NA #add some NA's to resemble real dataset
plot(r)
> r
class : RasterLayer
dimensions : 40, 40, 1600 (nrow, ncol, ncell)
resolution : 0.5, 0.5 (x, y)
extent : -110, -90, 40, 60 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
data source : in memory
names : layer
values : -3.23261, 2.861592 (min, max)
Since your raster's resolution is 0.5 and you have 40 rows, you want the mean / sd for every 4 rows:
set.seed(2013)
library(raster)
r <- raster(xmn=-110, xmx=-90, ymn=40, ymx=60, ncols=40, nrows=40)
r <- setValues(r, rnorm(1600)) #add values to raster
r[r > -0.2 & r < 0.2] <- NA #add some NA's to resemble real dataset
rmean <- sapply(seq(1,nrow(r),4),function(rix) mean(r[rix:rix+3,],na.rm=T))
rsd <- sapply(seq(1,nrow(r),4),function(rix) sd(r[rix:rix+3,],na.rm=T))
# > rmean
# [1] -0.033134373 -0.180689704 0.176575934 -0.003422832 -0.049113312 0.234891614 0.188559162 -0.026514169 0.106970362
# [10] 0.096033677
So you're basically indexing the raster as matrix, only using the slices needed for mean / sd. For iteration you could also use lapply, which puts everything in a neat list.
You can aggregate your rows (groups of 4 in this case) and columns (into one column)
a <- aggregate(r, c(ncol(r), 4), fun=mean)
b <- aggregate(r, c(ncol(r), 4), fun=sd)
lat <- yFromRow(a, 1:nrow(a))
plot(lat, values(a))