Calculate zonal statistics in R as in GIS - r

I have multiple rasters in a folder. I need to extract mean of each of these rasters over a polygon shape file (has more 2500 polygons).
I came across two functions zonal and extract. It says extract can be used for points, lines and polygons too. Is it the only difference ? (Yes/No expected)
How can I extract mean from these multiple rasters and specify different column names as per their filenames for these extracted mean values ?
Edit::
I found a code somewhere and implemented it. But it is taking forever and no progress at all.
grids <- list.files("my_path", pattern = "*.tif$")
#check the number of files in the raster list (grids)
length <- length(grids)
#read-in the polygon shapefile
poly <- readShapePoly("my_path/supplimentY.shp")
#create a raster stack
s <- stack(paste0("my_path/", grids))
#extract raster cell count (sum) within each polygon area (poly)
for (i in 1:length(grids)){
ex <- extract(s, poly, fun='mean', na.rm=TRUE, df=TRUE, weights = TRUE)
# the code doesnot progress from here onwards.
# i checked it by adding this line:: print(i)
}
#write to a data frame
dfr <- data.frame(ex)

You do not need the loop (you repeat the same operation at each iteration!).
It should be like this:
library(raster)
ff <- list.files("my_path", pattern = "\\.tif$", full=TRUE)
s <- stack(ff)
poly <- shapefile("my_path/supplimentY.shp")
ex <- extract(s, poly, fun='mean', na.rm=TRUE, df=TRUE, weights = TRUE)

I used the same code to calculate the zonal mean of climatology for an area boundary, it took me 5-6mins to work on 2736 layers of raster data.
layers <- length(clim)
for (i in 1:length(clim)) {
ex <- extract(clim, shpwb, fun=mean, na.rm=TRUE, df=TRUE)
}
df <- data.frame(ex)
write.csv(df, file = "E:/Central University of Jharkhand/3rd Semester/Climatology/R Studio/CSV.csv")

Related

R: Iteratively crop and mask several rasters with a shapefile

I am having 20 raster layers and 1 shapefile. I would like to:
crop and mask each raster iteratively (i.e., first crop and mask the first raster, then the second and so on)
save each cropped raster with the same name as its original into a subfolder of the working directory.
I can do this for every raster (manually, one-by-one) but I would like to automate the process. My raster layers do not have the same extent, that's why I want to crop and mask them one-by-one (also they are large in size (mb) which means it would be difficult for my laptop to set the same extent and then clip them all together using the shapefile). Here is the code for doing this in a one-by-one manner:
library(terra)
setwd("path")
r1 = rast("path/a.tif")
r2 = rast("path/b.tif")
poly <- vect("path/poly.shp")
r1 <- crop(r1,
poly,
mask = T)
r2 <- crop(r2,
poly,
mask = T)
dir.create(file.path(getwd(), 'clip'), recursive = T)
writeRaster(r1, "path/clip/r1.tif")
writeRaster(r2, "path/clip/r2.tif")
From here you can download a small subset the data.
Perhaps something like this:
library(terra)
crop_poly <- vect(file.path(here::here(), "poly.shp"))
filenames_in <- list.files(here::here(), pattern = "\\.tif$")
corpped_rasters <- filenames_in |>
# apply rast() on each filename in filenames_in list,
# returns list of rasters
lapply(rast) |>
# apply crop(), first param from input list (of rasters),
# pass on y and mask parameters
lapply(crop, y = crop_poly, mask = T)
filenames_out <- file.path("clip", filenames_in)
dir.create(file.path(here::here(), 'clip'))
# apply writeRaster on pairs of list elements from corpped_rasters & filenames_out
# writeRaster(corpped_rasters[[1]], filenames_out[[1]])
# writeRaster(corpped_rasters[[2]], filenames_out[[2]])
# ...
mapply(writeRaster, corpped_rasters, filenames_out)

Convert a column value(s) in SpatialpolygonDataframe into raster image

I need help with converting a variable or column values in a spatial polygon into a raster image. I have spatial data of administrative units with income(mean) information for each unit. I want to convert this information into raster for further analysis.
I tried the code below but it didn't work.
r <- raster(ncol=5,nrow=15)
r.inc <- rasterize(DK,r,field=DK#data[,2],fun=mean)
Where SP is the spatial polygon and the mean income for each spatial unit stored in column 2 of the SpatialPolygonDataframe. Can anyone help with a function or code of how to rasterise the values in the column of interest? An example of the spatialpolygondataframe (created) and my attempt to rasterize the data are below
suppressPackageStartupMessages(library(tidyverse))
url = "https://api.dataforsyningen.dk/landsdele?format=geojson"
geofile = tempfile()
download.file(url, geofile)
DK <- rgdal::readOGR(geofile)
DK#data = subset(DK#data, select = c(navn))
DK#data$inc = runif(11, min=5000, max=80000)
require(raster)
r <- raster(ncol=5,nrow=15)
r.inc <- rasterize(DK,r,field=DK#data[,2],fun=mean)
plot(r.inc)
Thank you.
Acknowledgement: The code for creating the sample SPDF was sourced from Mikkel Freltoft Krogsholm (link below).
https://www.linkedin.com/pulse/easy-maps-denmark-r-mikkel-freltoft-krogsholm/?trk=read_related_article-card_title
Here's something that makes a raster.
library(tidyverse)
library(rgdal)
library(raster)
url <- "https://api.dataforsyningen.dk/landsdele?format=geojson"
geofile <- tempfile()
download.file(url, geofile)
DK <- rgdal::readOGR(geofile)
r_dk <- raster(DK, nrows = 100, ncols = 100) # Make a raster of the same size as the spatial polygon with many cells
DK$inc <- runif(nrow(DK), min=5000, max=80000) # Add some fake income data
rr <- rasterize(DK, r_dk, field='inc') # Rasterize the polygon into the raster - fun = 'mean' won't make any difference
plot(rr)
The original raster was the size of the whole Earth so I think Denmark was being averaged to nothing. I resolved this by making an empty raster based on the extent of the DK spatial polygons with 100x100 cells. I also simplified the code. Generally, if you find yourself using # with spatial data manipulation, it's a sign that there might be a simpler way. Because the resolution of the raster is much larger than the size of each DK region, taking the average doesn't make much difference.

Extract Raster Pixels Values Using Vector Polygons in R

I have been struggling with this for hours.
I have a shapefile (called "shp") containing 177 polygons i.e. 177 counties. This shapefile is overlaid on a raster. My raster (called "ras") is made of pixels having different pollution values.
Now I would like to extract all pixel values and their number of occurrences for each polygon.
This is exactly what the QGIS function "zonal histogram" is doing. But I would like to do the exact same thing in R.
I tried the extract() function and I managed to get a mean value per county, which is already a first step, but I would like to make a pixels distribution (histogram).
Could someone give me a hand ?
Many thanks,
Marie-Laure
Thanks a lot for your help. Next time I promise I will be careful and explain my issue more in details.
With your help I managed to find a solution.
I also used this website : http://zevross.com/blog/2015/03/30/map-and-analyze-raster-data-in-r/
For information, first I had to uninstall the "tidyr" package because there was a conflict with the extract function.
In case it can help someone, here is the final code :
# Libraries loading
library(raster)
library(rgdal)
library(sp)
# raster layer import
ras=raster("C:/*.tif")
# shapefile layer import
shp<-shapefile("C:/*.shp")
# Extract the values of the pixels raster per county
ext <- extract(ras, shp, method='simple')
# Function to tabulate pixel values by region & return a data frame
tabFunc <- function(indx, extracted, region, regname) {
dat <- as.data.frame(table(extracted[[indx]]))
dat$name <- region[[regname]][[indx]]
return(dat)
}
# run through each county & compute a table of the number
# of raster cells by pixel value. ("CODE" is the county code)
tabs <- lapply(seq(ext), tabFunc, ext, shp, "CODE")
# assemble into one data frame
df <- do.call(rbind, tabs)
# to see the data frame in R
print(df)
# table export
write.csv(df,"C:/*.csv", row.names = FALSE)
Here is a minimal, self-contained, reproducible example (almost literally from ?raster::extract, so not difficult to make)
library(raster)
r <- raster(ncol=36, nrow=18, vals=rep(1:9, 72))
cds1 <- rbind(c(-180,-20), c(-160,5), c(-60, 0), c(-160,-60), c(-180,-20))
cds2 <- rbind(c(80,0), c(100,60), c(120,0), c(120,-55), c(80,0))
polys <- spPolygons(cds1, cds2)
Now you can do
v <- extract(r, polys)
par(mfrow=c(1,2))
z <- lapply(v, hist)
Or more fancy
mains <- c("first", "second")
par(mfrow=c(1,2))
z <- lapply(1:length(v), function(i) hist(v[[i]], main=mains[i]))
Or do you want a barplot
z <- lapply(1:length(v), function(i) barplot(table(v[[i]]), main=mains[i]))

Line density function in R equivalent to Line density tool in ArcMap (arcpy)

I need to calculate the magnitude-per-unit area of polylines that fall within a radius around each cell. Essentially I need to calculate a km/km2 road density within a 500m pixel search radius. ArcMap has a quick and easy tool that handles this, but I need a pure R solution.
Here is a link on how line density works: http://desktop.arcgis.com/en/arcmap/10.3/tools/spatial-analyst-toolbox/how-line-density-works.htm
And this is how to use it in a python (arcpy) script: http://desktop.arcgis.com/en/arcmap/10.3/tools/spatial-analyst-toolbox/line-density.htm
I currently execute a backwards approach using raster::focal function, calculating a density of burned in road features. I then convert the km2/km2 output to km/km2.
#Import libraries
library(raster)
library(rgdal)
library(gdalUtils)
#Read-in an already created raster mask (cells are all set to 0)
mask <- raster("x://path to raster mask...")
#Make a copy of the mask to burn features in, keeping the original untouched
roads_mask <- file.copy(mask, "x://output path ...//roads.tif")
#Read-in road features (shapefile format)
roads_sldf <- readOGR("x://path to shapefile" , "roads")
#Rasterize spatial lines data frame ie. burn road features into mask
#Where road features get a value of 1, mask extent gets a value of 0
roads_raster <- gdalUtils::gdal_rasterize(src_datasource = roads_sldf,
dst_filename = "x://output path ...//roads.tif", b = 1,
burn = 1, l = "roads", output_Raster = TRUE)
#Run a 1km circular radius density function (be mindful of edge effects)
weight <- raster::focalWeight(roads_raster,1000,type = "circle")
1km_rdDensity <- raster::focal(roads_raster, weight, fun=sum, filename = '',
na.rm=TRUE, pad=TRUE, NAonly=FALSE, overwrite=TRUE)
#Convert km2/km2 road density to km/km2
#Set up the moving window
weight <- raster::focalWeight(roads_raster,1000,type = "circle")
#Count how many records in each column of the moving window are > 0
columnCount <- apply(weight,2,function(x) sum(x > 0))
#Get the sum of the column count
number_of_cells <- sum(columnCount)
#multiply km2/km2 density by number of cells in the moving window
step1 <- roads_raster * number_of_cells
#Rescale step1 output with respect to cell size(30m) and radius of a circle
final_rdDensity <- (step1*0.03)/3.14159265
#Write out final km/km2 road density raster
writeRaster(final_rdDensity,"X://path to output...", datatype = 'FLT4S', overwrite = TRUE)
After some more research I think I may be able to use a kernel function, however I don't want to apply the smoothing algorithm... As well the output is an 'im' object which I would need to write to as a 'tif'
#Import libraries
library(spatstat)
library(rgdal)
#Read-in road features (shapefile format)
roads_sldf <- readOGR("x://path to shapefile" , "roads")
#Convert roads spatial lines data frame to psp object
psp_roads <- as.psp(roads_sldf)
#Apply kernel density, however this is where I am unsure of the arguments
road_density <- spatstat::density.psp(psp_roads, sigma = 0.01, eps = 500)
Cheers.
See this question https://gis.stackexchange.com/questions/138861/calculating-road-density-in-r-using-kernel-density
Tried to mark as a duplicate but doesn't work because the other Q is on gis stack exchange
Short answer is use spatstat.geom::pixellate()
I also needed spatstat.geom::as.psp(sf::st_geometry(x)) to convert an sf lines object to the correct format and maptools::as.im.RasterLayer(r) to convert a raster. I was able to convert the result to RasterLayer with raster::raster(pix_res)
Perhaps you can use terra::rasterizeGeom which is available in the development version that you can install with install.packages('terra', repos='https://rspatial.r-universe.dev')
Example data
library(terra)
f <- system.file("ex/lux.shp", package="terra")
v <- vect(f) |> as.lines()
r <- rast(v, res=.1)
Solution
x <- rasterizeGeom(v, r, fun="length", "km")
And then use focal sum, but you would not have a perfect circle.
What you could do instead, if your dataset is not too large, is create a circle for each grid cell and use intersect. Something like this:
p <- xyFromCell(r, 1:ncell(r)) |> vect(crs="+proj=longlat")
p$id <- 1:ncell(r)
b <- buffer(p, 10000)
values(v) <- NULL
i <- intersect(v, b)
x <- aggregate(perim(i), list(id=i$id), sum)
r[x$id] <- x[,2]

Extract shapefiles from longitude/latitude gridded data

I have some gridded data of sea surface temperature values in the Mediterranean to which I've applied clustering. I have 420 files with three columns structure (long,lat,value). The data for a particular file looks like this map
Now I want to extract the cluster areas as shapefile for postprocessing. I have found this post (https://gis.stackexchange.com/a/187800/9227) and tried to use its code like this
# Packages
library(sp)
library(rgdal)
library(raster)
# Paths
ruta_datos<-"/home/meteo/PROJECTES/VERSUS/OUTPUT/DATA/CLUSTER_MED/"
setwd("~/PROJECTES/VERSUS/temp")
# File list
files <- list.files(path = ruta_datos, pattern = "SST-cluster-mitja-mensual")
for (i in 1:length(files)){
datos<-read.csv(paste0(ruta_datos,files[i],sep=""),header=TRUE)
nclusters<-max(datos$cluster)
for (j in 1:nclusters){
clust.dat<-subset(datos, cluster == j)
coordinates(clust.dat)=~longitud+latitud
proj4string(clust.dat)=CRS("+init=epsg:4326")
pts = spTransform(clust.dat,CRS("+init=epsg:4326"))
gridded(pts) = TRUE
r = raster(pts)
projection(r) = CRS("+init=epsg:4326")
# make all values the same. Either do
s <- r > -Inf
# convert to polygons
pp <- rasterToPolygons(s, dissolve=TRUE)
# save shapefile
shname<-paste("SST-shape-",substr(files[i],27,32),"-",j,sep="")
writeOGR(pp, dsn = '.', layer = shname, driver = "ESRI Shapefile")
}
}
But the code stops for with this error message
gridded(pts) = TRUE
suggested tolerance minimum: 1
Error in points2grid(points, tolerance, round) : dimension 2
: coordinate intervals are not constant
Warning message: In points2grid(points, tolerance, round) : grid has empty
column/rows in dimension 1
I don't understand that at a certain file it says that coordinate intervals are not constant while they indeed are, original SST data from which clustering was derived are on a regular grid over the whole globe. All cluster data files have the same size, 4248 points. A sample data file is available here
What does the tolerance suggestion means? I've been looking for a solution and found some suggestion to use SpatialPixelsDataFrame but couldn't find out how to apply.
Any help would be appreciated. Thanks.
I am not an expert of geospatial data but for me, if you filter on cluster, data are indeed not on a grid. So far as I understand, you start from a grid (convex set of regularly distant points).
I tried following modifications to your code and some files are generated but I can't test whether they are correct or not.
Principle is to build the grid on all data then only filter on cluster before calling raster.
This gives:
files <- list.files(path = ruta_datos, pattern = "SST-cluster-mitja-mensual")
for (i in 1:length(files)){
datos<-read.csv(paste0(ruta_datos,files[i],sep=""),header=TRUE)
nclusters<-max(datos$cluster)
for (j in 1:nclusters){
## clust.dat<-subset(datos, cluster == j)
clust.dat <- datos
coordinates(clust.dat)=~longitud+latitud
proj4string(clust.dat)=CRS("+init=epsg:4326")
pts = spTransform(clust.dat,CRS("+init=epsg:4326"))
gridded(pts) = TRUE
## r = raster(pts)
r= raster(pts[pts$cluster==j,])
projection(r) = CRS("+init=epsg:4326")
# make all values the same. Either do
s <- r > -Inf
# convert to polygons
pp <- rasterToPolygons(s, dissolve=TRUE)
# save shapefile
shname<-paste("SST-shape-",substr(files[i],27,32),"-",j,sep="")
writeOGR(pp, dsn = '.', layer = shname, driver = "ESRI Shapefile")
}
}
So, two lines in comment and update just the line below.

Resources