Extract Raster Pixels Values Using Vector Polygons in R - r

I have been struggling with this for hours.
I have a shapefile (called "shp") containing 177 polygons i.e. 177 counties. This shapefile is overlaid on a raster. My raster (called "ras") is made of pixels having different pollution values.
Now I would like to extract all pixel values and their number of occurrences for each polygon.
This is exactly what the QGIS function "zonal histogram" is doing. But I would like to do the exact same thing in R.
I tried the extract() function and I managed to get a mean value per county, which is already a first step, but I would like to make a pixels distribution (histogram).
Could someone give me a hand ?
Many thanks,
Marie-Laure

Thanks a lot for your help. Next time I promise I will be careful and explain my issue more in details.
With your help I managed to find a solution.
I also used this website : http://zevross.com/blog/2015/03/30/map-and-analyze-raster-data-in-r/
For information, first I had to uninstall the "tidyr" package because there was a conflict with the extract function.
In case it can help someone, here is the final code :
# Libraries loading
library(raster)
library(rgdal)
library(sp)
# raster layer import
ras=raster("C:/*.tif")
# shapefile layer import
shp<-shapefile("C:/*.shp")
# Extract the values of the pixels raster per county
ext <- extract(ras, shp, method='simple')
# Function to tabulate pixel values by region & return a data frame
tabFunc <- function(indx, extracted, region, regname) {
dat <- as.data.frame(table(extracted[[indx]]))
dat$name <- region[[regname]][[indx]]
return(dat)
}
# run through each county & compute a table of the number
# of raster cells by pixel value. ("CODE" is the county code)
tabs <- lapply(seq(ext), tabFunc, ext, shp, "CODE")
# assemble into one data frame
df <- do.call(rbind, tabs)
# to see the data frame in R
print(df)
# table export
write.csv(df,"C:/*.csv", row.names = FALSE)

Here is a minimal, self-contained, reproducible example (almost literally from ?raster::extract, so not difficult to make)
library(raster)
r <- raster(ncol=36, nrow=18, vals=rep(1:9, 72))
cds1 <- rbind(c(-180,-20), c(-160,5), c(-60, 0), c(-160,-60), c(-180,-20))
cds2 <- rbind(c(80,0), c(100,60), c(120,0), c(120,-55), c(80,0))
polys <- spPolygons(cds1, cds2)
Now you can do
v <- extract(r, polys)
par(mfrow=c(1,2))
z <- lapply(v, hist)
Or more fancy
mains <- c("first", "second")
par(mfrow=c(1,2))
z <- lapply(1:length(v), function(i) hist(v[[i]], main=mains[i]))
Or do you want a barplot
z <- lapply(1:length(v), function(i) barplot(table(v[[i]]), main=mains[i]))

Related

calculating road density raster from road shapefile

I'm looking to turn a shapefile with roads (which includes a column of length per road) in the Eastern half of the USA into a raster of 1x1km of road density, using R.
I can't find a straightforward way in Arcmap (Line density works with a radius from the cell center instead of just the cell).
Here is a solution that creates polygons from the raster cells (adapted from my answer here). You may need to to this for subsets of your dataset and then combine.
Example data
library(terra)
v <- vect(system.file("ex/lux.shp", package="terra"))
roads <- as.lines(v)
rs <- rast(v)
Solution
values(rs) <- 1:ncell(rs)
names(rs) <- "rast"
rsp <- as.polygons(rs)
rp <- intersect(roads, rsp)
rp$length <- perim(rp) / 1000 #km
x <- tapply(rp$length, rp$rast, sum)
r <- rast(rs)
r[as.integer(names(x))] <- as.vector(x)
plot(r)
lines(roads)

Convert a column value(s) in SpatialpolygonDataframe into raster image

I need help with converting a variable or column values in a spatial polygon into a raster image. I have spatial data of administrative units with income(mean) information for each unit. I want to convert this information into raster for further analysis.
I tried the code below but it didn't work.
r <- raster(ncol=5,nrow=15)
r.inc <- rasterize(DK,r,field=DK#data[,2],fun=mean)
Where SP is the spatial polygon and the mean income for each spatial unit stored in column 2 of the SpatialPolygonDataframe. Can anyone help with a function or code of how to rasterise the values in the column of interest? An example of the spatialpolygondataframe (created) and my attempt to rasterize the data are below
suppressPackageStartupMessages(library(tidyverse))
url = "https://api.dataforsyningen.dk/landsdele?format=geojson"
geofile = tempfile()
download.file(url, geofile)
DK <- rgdal::readOGR(geofile)
DK#data = subset(DK#data, select = c(navn))
DK#data$inc = runif(11, min=5000, max=80000)
require(raster)
r <- raster(ncol=5,nrow=15)
r.inc <- rasterize(DK,r,field=DK#data[,2],fun=mean)
plot(r.inc)
Thank you.
Acknowledgement: The code for creating the sample SPDF was sourced from Mikkel Freltoft Krogsholm (link below).
https://www.linkedin.com/pulse/easy-maps-denmark-r-mikkel-freltoft-krogsholm/?trk=read_related_article-card_title
Here's something that makes a raster.
library(tidyverse)
library(rgdal)
library(raster)
url <- "https://api.dataforsyningen.dk/landsdele?format=geojson"
geofile <- tempfile()
download.file(url, geofile)
DK <- rgdal::readOGR(geofile)
r_dk <- raster(DK, nrows = 100, ncols = 100) # Make a raster of the same size as the spatial polygon with many cells
DK$inc <- runif(nrow(DK), min=5000, max=80000) # Add some fake income data
rr <- rasterize(DK, r_dk, field='inc') # Rasterize the polygon into the raster - fun = 'mean' won't make any difference
plot(rr)
The original raster was the size of the whole Earth so I think Denmark was being averaged to nothing. I resolved this by making an empty raster based on the extent of the DK spatial polygons with 100x100 cells. I also simplified the code. Generally, if you find yourself using # with spatial data manipulation, it's a sign that there might be a simpler way. Because the resolution of the raster is much larger than the size of each DK region, taking the average doesn't make much difference.

Population density within polygons

So, I have some questions regarding the raster package in R. I have a raster with estimated population in each grid point. I also have a shapefile with polygons of regions. I want to find out the coordinates of the neighborhood with the highest population density within each regions. Supose that each neighborhood is a homogeneous square of 5 by 5 grid points.
The following toy example mimics my problem.
library(raster)
library(maptools)
set.seed(123)
data(wrld_simpl)
wrld_simpl <- st_as_sf(wrld_simpl)
contr_c_am <- wrld_simpl %>%
filter(SUBREGION ==13) %>%
filter(FIPS != "MX") %>%
select(NAME)
# Create a raster of population (sorry for the bad example spatial distribution)
r <- raster(xmn=-180, xmx=180, ymn=-90, ymx=90, res=0.1)
values(r) <- runif(ncell(r), 0, 100)
# keep only raster around the region of interest
r_small <- crop(r, extent(contr_c_am))
plot(r_small)
plot(st_geometry(contr_c_am), add = T)
raster_contr_c_am <- rasterize(contr_c_am, r)
raster_contr_c_am is the population grid and the name of the region is saved as an attribute.
Somehow I need to filter only grid points from one region, and probably use some funcion like focal() to find total nearby population.
focal(raster_contr_c_am, matrix(1,5,5),sum, pad = T, padValue = 0)
Then, I need to find which grid point has the highest value within each region, and save it's coordinates.
I hope my explanation is not too confusing,
Thanks for any help!
Here's an example that iterates over the shape defining the region, then uses the raster values within the region and the focal() function to find the maximum.
library(raster)
library(maptools)
library(sf)
library(dplyr)
set.seed(123)
data(wrld_simpl)
wrld_simpl <- st_as_sf(wrld_simpl)
contr_c_am <- wrld_simpl %>%
filter(SUBREGION ==13) %>%
filter(FIPS != "MX") %>%
select(NAME)
# Create a raster of population (sorry for the bad example spatial distribution)
r <- raster(xmn=-180, xmx=180, ymn=-90, ymx=90, res=0.1)
values(r) <- runif(ncell(r), 0, 100)
# keep only raster around the region of interest
r_small <- crop(r, extent(contr_c_am))
raster_contr_c_am <- rasterize(contr_c_am, r_small)
# function to find the max raster value using focal
# in a region
findMax <- function(region, raster) {
tt <- trim((mask(raster, region))) # focus on the region
ff <- focal(tt, w=matrix(1/25,nc=5,nr=5))
maximumCell <- which.max(ff) # find the maximum cell id
maximumvalue <- maxValue(ff) # find the maximum value
maximumx <- xFromCell(ff, maximumCell) # get the coordinates
maximumy <- yFromCell(ff, maximumCell)
# return a data frame
df <- data.frame(maximumx, maximumy, maximumvalue)
df
}
numberOfShapes <- nrow(contr_c_am)
ll <- lapply(1:numberOfShapes, function(s) findMax(region = contr_c_am[s,], raster = r_small))
merged <- do.call(rbind, ll)
maxpoints <- st_as_sf(merged, coords=c('maximumx', 'maximumy'), crs=crs(contr_c_am))
library(mapview) # optional but nice visualization - select layers to see if things look right
mapview(maxpoints) + mapview(r_small) + mapview(contr_c_am)
I've made an sf object so that it can be plotted with the other spatial objects. Using the mapview package, I get this.

Zonal statistics to get majority pixel value per polygon in R?

Actually I try to calculate the major pixel values from a raster with a SpatialPolygonsDataFrame. Here is some code I found which might lead in the right direction:
library(raster)
# Create interger class raster
r <- raster(ncol=36, nrow=18)
r[] <- round(runif(ncell(r),1,10),digits=0)
r[]<-as.integer(r[])
# Create two polygons
cds1 <- rbind(c(-180,-20), c(-160,5), c(-60, 0), c(-160,-60), c(-180,-20))
cds2 <- rbind(c(80,0), c(100,60), c(120,0), c(120,-55), c(80,0))
polys <- SpatialPolygonsDataFrame(SpatialPolygons(list(Polygons(list(Polygon(cds1)), 1),
Polygons(list(Polygon(cds2)),2))),data.frame(ID=c(1,2)))
# Extract raster values to polygons
( v <- extract(r, polys) )
# Get class counts for each polygon
v.counts <-lapply(v,table)
So far everything is fine but I´m really stuck to extract the column name of the column which has the highest counts.
I tried things like:
v.max<- lapply(v.counts,max)
But there the column information gets lost. After:
v.max<- lapply(v.counts, max.col)
I get just "1" as result.
I´d appreciate if somebody can give me a hint what I´m doing wrong. Is there also another way to extract the major pixel values in a polygon?
which.max() is your friend. Since you just want the names, use names().
sapply(v.counts, function(x) names(x)[which.max(x)])
# [1] "9" "5"
Note: set.seed(42)
exactextractr package can do this trick. It computes zonal statistics even faster than terra in some cases. See comparison here
library(exactextractr)
exact_extract(r, polys, 'majority')
#> Warning in .exact_extract(x, sf::st_as_sf(y), ...): No CRS specified for
#> polygons; assuming they have the same CRS as the raster.
#> |======================================================================| 100%
#> [1] 4 2
You can use the modal function
v <- extract(r, polys, modal)

Line density function in R equivalent to Line density tool in ArcMap (arcpy)

I need to calculate the magnitude-per-unit area of polylines that fall within a radius around each cell. Essentially I need to calculate a km/km2 road density within a 500m pixel search radius. ArcMap has a quick and easy tool that handles this, but I need a pure R solution.
Here is a link on how line density works: http://desktop.arcgis.com/en/arcmap/10.3/tools/spatial-analyst-toolbox/how-line-density-works.htm
And this is how to use it in a python (arcpy) script: http://desktop.arcgis.com/en/arcmap/10.3/tools/spatial-analyst-toolbox/line-density.htm
I currently execute a backwards approach using raster::focal function, calculating a density of burned in road features. I then convert the km2/km2 output to km/km2.
#Import libraries
library(raster)
library(rgdal)
library(gdalUtils)
#Read-in an already created raster mask (cells are all set to 0)
mask <- raster("x://path to raster mask...")
#Make a copy of the mask to burn features in, keeping the original untouched
roads_mask <- file.copy(mask, "x://output path ...//roads.tif")
#Read-in road features (shapefile format)
roads_sldf <- readOGR("x://path to shapefile" , "roads")
#Rasterize spatial lines data frame ie. burn road features into mask
#Where road features get a value of 1, mask extent gets a value of 0
roads_raster <- gdalUtils::gdal_rasterize(src_datasource = roads_sldf,
dst_filename = "x://output path ...//roads.tif", b = 1,
burn = 1, l = "roads", output_Raster = TRUE)
#Run a 1km circular radius density function (be mindful of edge effects)
weight <- raster::focalWeight(roads_raster,1000,type = "circle")
1km_rdDensity <- raster::focal(roads_raster, weight, fun=sum, filename = '',
na.rm=TRUE, pad=TRUE, NAonly=FALSE, overwrite=TRUE)
#Convert km2/km2 road density to km/km2
#Set up the moving window
weight <- raster::focalWeight(roads_raster,1000,type = "circle")
#Count how many records in each column of the moving window are > 0
columnCount <- apply(weight,2,function(x) sum(x > 0))
#Get the sum of the column count
number_of_cells <- sum(columnCount)
#multiply km2/km2 density by number of cells in the moving window
step1 <- roads_raster * number_of_cells
#Rescale step1 output with respect to cell size(30m) and radius of a circle
final_rdDensity <- (step1*0.03)/3.14159265
#Write out final km/km2 road density raster
writeRaster(final_rdDensity,"X://path to output...", datatype = 'FLT4S', overwrite = TRUE)
After some more research I think I may be able to use a kernel function, however I don't want to apply the smoothing algorithm... As well the output is an 'im' object which I would need to write to as a 'tif'
#Import libraries
library(spatstat)
library(rgdal)
#Read-in road features (shapefile format)
roads_sldf <- readOGR("x://path to shapefile" , "roads")
#Convert roads spatial lines data frame to psp object
psp_roads <- as.psp(roads_sldf)
#Apply kernel density, however this is where I am unsure of the arguments
road_density <- spatstat::density.psp(psp_roads, sigma = 0.01, eps = 500)
Cheers.
See this question https://gis.stackexchange.com/questions/138861/calculating-road-density-in-r-using-kernel-density
Tried to mark as a duplicate but doesn't work because the other Q is on gis stack exchange
Short answer is use spatstat.geom::pixellate()
I also needed spatstat.geom::as.psp(sf::st_geometry(x)) to convert an sf lines object to the correct format and maptools::as.im.RasterLayer(r) to convert a raster. I was able to convert the result to RasterLayer with raster::raster(pix_res)
Perhaps you can use terra::rasterizeGeom which is available in the development version that you can install with install.packages('terra', repos='https://rspatial.r-universe.dev')
Example data
library(terra)
f <- system.file("ex/lux.shp", package="terra")
v <- vect(f) |> as.lines()
r <- rast(v, res=.1)
Solution
x <- rasterizeGeom(v, r, fun="length", "km")
And then use focal sum, but you would not have a perfect circle.
What you could do instead, if your dataset is not too large, is create a circle for each grid cell and use intersect. Something like this:
p <- xyFromCell(r, 1:ncell(r)) |> vect(crs="+proj=longlat")
p$id <- 1:ncell(r)
b <- buffer(p, 10000)
values(v) <- NULL
i <- intersect(v, b)
x <- aggregate(perim(i), list(id=i$id), sum)
r[x$id] <- x[,2]

Resources