I would like to know if there is way to create zonal statistics for RasterLayerObjects, specifically the count of a given cell value (e.g. a land-use class) in R without having to reclassify the whole raster. The solution should be memory efficient in order to work on large raster files i.e. no extraction of the values into a matrix in R is desired.
Below an example of how I handle it until now. In this case I reclassify the original raster to hold only 1 for the value of interest and missings for all other values.
My proposed solution creates both, redundant data and additional processing steps to get me to my initial goal. I thought something like zonal(r1[r1==6],r2,"count") would work but obviously it does not (see below).
# generate reproducible Raster
library("raster")
## RASTER 1 (e.g. land-use classes)
r1 <- raster( crs="+proj=utm +zone=31")
extent(r1) <- extent(0, 100, 0, 100)
res(r1) <- c(5, 5)
values(r1) <- sample(10, ncell(r1), replace=TRUE)
plot(r1)
## RASTER 2 (containing zones of interest)
r2 <- raster( crs="+proj=utm +zone=31")
extent(r2) <- extent(0, 100, 0, 100)
res(r2) <- c(5, 5)
values(r2) <- c(rep(1,100),rep(2,100),rep(3,100),rep(4,100))
plot(r2)
# (1) ZONAL STATISTICS
# a. how many cells per zone (independent of specific cell value)
zonal(r1,r2,"count")
# b. how many cells per zone of specific value 6
zonal(r1[r1==6],r2,"count")
# -> fails
# with reclassification
r1.reclass<-
reclassify(r1,
matrix(c(1,5,NA,
5.5,6.5,1, #class of interest
6.5,10,NA),
ncol=3,
byrow = T),
include.lowest=T # include the lowest value from the table.
)
zonal(r1.reclass,r2,"count")
you can use raster::match.
zonal(match(r1, 6),r2, "count")
As you can see from plot(match(r1, 6)), it only returns raster cells which hold the desired value(s). All other cells are NA.
r1==6 as used in your try unfortunately returns a vector and therefore cannot be used in focal anymore.
Related
I am trying to calculate the agreement of two different cropland maps at zonal/ grid level.
I first created the rasters that show cropland agreement between a classified raster and validation data that was created.
agreement<- sample_raster==raster_sampled
crop_agreement <- sample_raster==1 & raster_sampled==1
I then prepared the tiles for zonal analysis by:
tiles_poly$id<- as.double(tiles_poly$id)
tiles_poly$zone <- tiles_poly$id
tiles<-fasterize(st_as_sf(tiles_poly),raster_aoi, field="id")
This works fine.
However, when I'm trying to calculate the agreement at zonal level, I get an error.
zonal_crop_agreement<- zonal(crop_agreement,tiles,fun='sum', na.rm=TRUE)
Error in dimnames(x) <- dn :
length of 'dimnames' [2] not equal to array extent
As far as I understand, the error arises because the number of columns is not equal between the two rasters. Yet, when I print the length of tiles & crop_agreement, the have the same length.
I think therefore, that the error might be due to the "fun = 'sum'". I also do not exactly understand what that function does.
Does someone have an idea what the problem could be?
The complete code is here:
## INPUT DATA GOES HERE ##
test_name<- "220624_cropland" # use this to name the test. The exported files will have this name
tiles_poly<-readOGR("C:/data_cropland_burkina/Grid_50km.shp") #the tile grid
raster_aoi <- raster("C:/data_cropland_burkina/empty_raster.tif") #sets the resolution and area of interest
k_rast<-raster("C:data_cropland_burkina/Cropland_Burkina_Copernicus.tif") # this is the classified raster I am analyzing
#k_rast <- rast_class
## This is to convert points into raster ##
sample_points<-readOGR("C:/Users/louis/OneDrive/Desktop/data_cropland_burkina/validation_points_poly0.shp", stringsAsFactors = FALSE) #the validation points
sample_points$class_2 <- as.numeric(sample_points$item) # make sure the class column is numeric
sample_raster<- fasterize(st_as_sf(sample_points), raster_aoi, field="item")
## crop the raster (makes the script run faster) ##
#aoi<- crop(raster_aoi,k_rast)
plot(k_rast)
#plot(sample_points, add=TRUE) ##maybe remove this by commenting out
crop_aoi <- function(x) {crop(x,raster_aoi)}
k_rast <- crop_aoi(k_rast)
sample_raster<-crop_aoi(sample_raster)
## ONLY USE THIS IF THE RASTER DOES NOT HAVE THE SAME RESOLUTION AS THE RASTER_AOI ##
#resample all rasters
resize_aoi<- function(x){raster::resample(x,raster_aoi,method="ngb")}
k_rast<-resize_aoi(k_rast)
##Reclassify if the raster is binary. otherwise comment out.
m <- c(0, 2, 1, 1)
rclmat <- matrix(m, ncol=2, byrow=TRUE)
k_rast <- reclassify(k_rast, rclmat)
## get the unique values from each raster ##
k_rast_vals<-sort(unique(na.omit(getValues(k_rast))))
sample_vals<-unique(na.omit(getValues(sample_raster)))
## extract the crop and non-crop pixels from the sample ##
sample_crop<- sample_raster==1
sample_non_crop<- sample_raster==2
sample_raster_binary<-sample_raster>0
## reduce the classified raster to only the pixels covered by the sample ##
raster_sampled<- k_rast*sample_raster_binary
## Calculate the confusion matrix ##
crop_comp <- function(x) cellStats(raster_sampled==x * sample_crop, stat='sum')
non_crop_comp <- function(x)cellStats(raster_sampled==x * sample_non_crop, stat='sum')
results_crop <- rbind(sapply(k_rast_vals,crop_comp))
colnames(results_crop)<- k_rast_vals
results_non_crop<- rbind(sapply(k_rast_vals,non_crop_comp))
colnames(results_crop)<- k_rast_vals
results<- rbind(results_crop, results_non_crop)
rownames(results) <- c("crop_ref", "non_crop_ref")
results<-cbind(results, total = rowSums(results))
results<-rbind(results, total = colSums(results))
results
## ------SECTION 2- Zonal Analysis - show agreement at the grid level------- ##
## Create rasters that show cropland agreement between the classified raster and validation data ##
agreement<- sample_raster==raster_sampled
crop_agreement <- sample_raster==1 & raster_sampled==1
## prepare the tiles for zonal analysis ##
tiles_poly$id<- as.double(tiles_poly$id)
tiles_poly$zone <- tiles_poly$id
tiles<-fasterize(st_as_sf(tiles_poly),raster_aoi, field="id")
## calculate agreement at the zonal/grid level (thus, the statistics on cell alues of the raster within the ones defined by the other dataset - validation data & input crop maps)##
## next line calculates the zonal crop agreement, taking the crop agreement raster created as well as the prepared tiles as input
## na.rm removed missing values from the data if they are indicated as NA.
## the function fun = 'sum' is
zonal_crop_agreement<- zonal(crop_agreement,tiles,FUN='sum', na.rm=TRUE)
zonal_crop_ref <- zonal(sample_crop,tiles,fun= 'sum', na.rm=TRUE)
zonal_crop_agreement<- data.frame(cbind(zonal_crop_agreement, zonal_crop_ref[,2]))
colnames(zonal_crop_agreement)[3]<- "crop_ref"
zonal_crop_agreement<- data.frame(cbind(zonal_crop_agreement,zonal(raster_sampled==1,tiles,fun="sum",na.rm=TRUE)[,2]))
colnames(zonal_crop_agreement)[4]<- "crop_class"
zonal_crop_agreement<- data.frame(cbind(zonal_crop_agreement,zonal(sample_raster,tiles,fun="count",na.rm=TRUE)[,2]))
colnames(zonal_crop_agreement)[5]<- "samp_pxls"
I'm trying to calculate the majority value in a categorical raster data set in R, for example using land cover data. This would be similar to the focal statistics tool in ArcGIS using the majority statistic. I am able to calculate the majority land cover type using a rectangular moving window and the modal function:
library(raster)
# create data
r <- raster(nrows = 120, ncol = 120, xmn=0)
r[] <- sample(3, ncell(r), replace=TRUE)
a<-focal(r, w=matrix(1,3,3), fun=modal) # 3x3 moving window
plot(a)
However, when I apply a focal weight to define my circular moving window, the output values appear to be treated as continuous numbers and not discrete values, even if I convert the raster to a factor:
#convert to a factor factor
r.f<-as.factor(r)
#set up window
fw <- focalWeight(r.f, 4.5, type='circle')
#apply focal fxn
r.f.focal<-focal(r.f, w=fw, fun=modal, na.rm=TRUE)
It appears zeros might be added into the analysis and that is creating the problem. Can anyone steer me in the right direction?
I think it will be fixed if d (which is now 4.5) is an integer. Your defined radius should be based on cell counts (1,2,3,...,n). However, still, the output will be in floating format as each xij in the kernel is a floating point with sum of 1. To achieve an integer output there is also a third option.
library(raster)
set.seed(070319)
# create data
r <- raster(nrows = 120, ncol = 120, xmn=0)
r[] <- sample(3, ncell(r), replace=TRUE)
a<-focal(r, w=matrix(1,3,3), fun=modal) # 3x3 moving window
par(mfrow=c(1,2))
plot(r)
plot(a)
#set up window
fw <- focalWeight(r, 4, type='circle')
#apply focal fxn
r.f.focal<-focal(r, w=fw, fun=modal, na.rm=TRUE)
par(mfrow=c(1,2))
plot(r)
plot(r.f.focal)
for integer output you can also do this:
#set up window
fw <- ceiling(focalWeight(r, 4, type='circle'))#for integer output
#apply focal fxn
r.f.focal<-focal(r, w=fw, fun=modal, na.rm=TRUE)
par(mfrow=c(1,2), oma=c(0,0,0,1))
plot(r)
plot(r.f.focal)
apologies if this has been asked before but I could not find a solution to my problem, which I believe is a rather simple problem. I have a single source raster dataset containing continuous floating values ranging between 100 and 500. What I would like to do is loop through this source raster in increments of 50 to export/create new raster datasets of all values that are lower than the increment. For example, I have the following R code (using the raster library) to specify the raster and identify the increments. I would to develop a way to automatically create 9 output raster datasets that are less than or equal to the values of each increment. I can't seem to get there. Can anyone help? TIA!
#Trying to iteratively create new raster datasets
#Based on increments of Source Raster
library(raster)
setwd("C:/Path/To/Folder")
r=raster("Source_Raster.tif") #Raster is floating between 100 and 500
#Create a list of increments I would like to use
list <-seq(100, 500, 50)
#The list creates the following sequence:
# 100 150 200 250 300 350 400 450 500
###THIS IS WHERE I STRUGGLE####
# I would like to use the sequence to create
# new raster datasets that only include values
# from the source raster that are less than or equal to each increment
# for example, the first output raster will contain values less than
# or equal to the first increment (100)
r100 <- calc(r, fun=function(x){ x[x > 100] <- NA; return(x)} )
After you determined the break points, we can use the lapply function to create each raster layer. In this example, r_list is the final output with 9 raster layers.
library(raster)
set.seed(145)
# Create example raster
r <- raster(matrix(runif(100, min = 100, max = 500), ncol = 10))
# Create break points
brk <-seq(100, 500, 50)
# Conduct the operation, create nine raster than smaller than each break points
r_list <- lapply(brk, function(x){
temp <- r
temp[temp > x] <- NA
return(temp)
})
I have two raster layer of dimension (7801, 7651). I want to compare each pixel of one raster layer with the other and create a new raster which has the minimum pixel value among the initial two raster. That is, if any i,j pixel of raster 1 has value 25 and same i,j pixel of raster 2 has value 20, thus in the output raster the i,j pixel should be 20.
You can just use min with two raster layers.
Let's start with a reproducible example:
library(raster)
r1 <- raster(ncol = 5, nrow = 5)
r1[] <- 1:ncell(r1)
plot(r1)
r2 <- raster(ncol = 5, nrow = 5)
r2[] <- ncell(r2):1
par(mfrow = c(1,3))
plot(r1)
plot(r2)
Now we calculate the min of each overlapping cell within the two raster layers very easily with the implemented cell statistics:
r3 <- min(r2, r1)
plot(r3)
Furthermore, you can also apply statistics like mean, max, etc.
If the implemented statistics somehow fail, or you want to use your own statistics, you can also directly access the data per pixel. That is, you first copy one of the raster layers.
r3 <- r1
Afterwards, you can apply a function over the values.
r3[] <- apply(cbind(r1[], r2[]), 1, min)
Using #loki's example, you have three more options to calculate minimum value for both layers:
library(raster)
calc(stack(r1,r2),fun=min,na.rm=T)
stackApply(stack(r1,r2),indices = c(1,1),fun='min',na.rm=T)
overlay(r1,r2,fun=min,na.rm=T)
I have two thematic raster layers r1 and r2 for same area each following same classification scheme and has 16 classes. I need to find minimum distance between cell of r1 and cell of r2 but with same value. E.g. nth cell in r1 has value 10 and coordinates x1,y1. And in r2, there are 2 cells with value 10 and coordinates x1+2,y1+2 and x1-0.5,y1-0.5. Thus the value that I need for this cell would be 0.5,0.5.
I tried distance from raster package but it gives distance, for all cells that are NA, to the nearest cell that is not NA. I am confused as to how can I include second raster layer into this.
You can use knn from class package so that for each cell of r1 find index of nearest cell of r2 with the same category:
library(class)
library(raster)
#example of two rasters
r1 <- raster(ncol = 600, nrow = 300)
r2 <- raster(ncol = 600, nrow = 300)
#fill each with categories that rabge from 1 to 16
r1[] <- sample(1:16, ncell(r1), T)
r2[] <- sample(1:16, ncell(r2), T)
# coordinates of cells extracted
xy = xyFromCell(r1, 1:ncell(r1))
#multiply values of raster with a relatively large number so cells thet belong
#to each category have smaller distance with reagrd to other categories.
v1 = values(r1) * 1000000
v2 = values(r2) * 1000000
# the function returns indices of nearest cells
out = knn(cbind(v2, xy) ,cbind(v1, xy) ,1:ncell(r1), k=1)
So, use rasterToPoints to extract SpatialPoints object for unique thematic class. Then use the sp::spDists function to find the distance between your points.
library(raster)
r1 <- raster( nrow=10,ncol=10)
r2 <- raster( nrow=10,ncol=10)
set.seed(1)
r1[] <- ceiling(runif(100,0,10))
r2[] <- ceiling(runif(100,0,10))
dist.class <- NULL
for(i in unique(values(r1))){
p1 <- rasterToPoints(r1, fun=function(xx) xx==i, spatial=T)
p2 <- rasterToPoints(r2, fun=function(xx) xx==i, spatial=T)
dist.class[i] <- min(spDists(p1,p2))
}
cbind(class = unique(values(r1)),dist.class)
The loop may not be efficient for you. If it's a problem, wrap it into a function and lapply it. Also, be carefull with your class, if they aren't 1:10, my loop won't work. If your projection is in degree, you will probably need the geosphere package to get accurate results. But the best in that case I think is to use a projection in meters.
A memory safe approach using the raster-package would be to use the layerize() function to split up your raster value into a stack of binary rasters (16 in your case) and then use the distance() function to compute distances in the layers of r2, masking them with the respective layers of r1. Something like this:
layers1 <- layerize(r1, falseNA=TRUE)
layers2 <- layerize(r2, falseNA=TRUE)
# now you can loop over the layers (use foreach loop if you want
# to speed things up using parallel processing)
dist.stack <- layers1
for (i in 1:nlayers(r1)) {
dist.i <- distance(layers2[[i]])
dist.mask.i <- mask(dist, layers1[[i]])
dist.stack[[i]] <- dist.mask.i
}
# if you want pairwise distances for all classes in one layer, simply
# combine them using sum()
dist.combine <- sum(dist.stack, na.rm=TRUE)