Iterate through Incremental Raster Values in R to Create New Raster Datasets

Apologies if this has been asked before, but I could not find a solution to what I believe is a rather simple problem. I have a single source raster dataset containing continuous floating-point values ranging between 100 and 500. I would like to loop through this source raster in increments of 50 and create/export new raster datasets containing all values that are less than or equal to each increment. For example, I have the following R code (using the raster library) to specify the raster and identify the increments. I would like to develop a way to automatically create 9 output raster datasets, each holding the values that are less than or equal to one increment. I can't seem to get there. Can anyone help? TIA!
#Trying to iteratively create new raster datasets
#Based on increments of Source Raster
library(raster)
setwd("C:/Path/To/Folder")
r=raster("Source_Raster.tif") #Raster is floating between 100 and 500
#Create a list of increments I would like to use
list <-seq(100, 500, 50)
#The list creates the following sequence:
# 100 150 200 250 300 350 400 450 500
###THIS IS WHERE I STRUGGLE####
# I would like to use the sequence to create
# new raster datasets that only include values
# from the source raster that are less than or equal to each increment
# for example, the first output raster will contain values less than
# or equal to the first increment (100)
r100 <- calc(r, fun=function(x){ x[x > 100] <- NA; return(x)} )

Once you have determined the break points, you can use the lapply function to create each raster layer. In this example, r_list is the final output, a list of 9 raster layers.
library(raster)
set.seed(145)
# Create example raster
r <- raster(matrix(runif(100, min = 100, max = 500), ncol = 10))
# Create break points
brk <-seq(100, 500, 50)
# Run the operation: create nine rasters, each keeping only values less than or equal to one break point
r_list <- lapply(brk, function(x){
  temp <- r
  temp[temp > x] <- NA
  return(temp)
})
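As a possible follow-up to the lapply approach above, the list elements can be named after their break points so each output is easy to identify. This is only a sketch on the same synthetic raster; the names `r100` to `r500` and the `kept` vector are illustrative choices, not part of the original answer.

```r
library(raster)
set.seed(145)

# Same synthetic example raster and break points as in the answer
r <- raster(matrix(runif(100, min = 100, max = 500), ncol = 10))
brk <- seq(100, 500, 50)

# One raster per break point, keeping only values <= that break point
r_list <- lapply(brk, function(x) {
  temp <- r
  temp[temp > x] <- NA
  temp
})

# Illustrative naming scheme (an assumption): r100, r150, ..., r500
names(r_list) <- paste0("r", brk)

# Count how many cells survive each threshold as a quick sanity check
kept <- sapply(r_list, function(x) sum(!is.na(getValues(x))))
print(kept)
```

Each named element could then be passed to writeRaster() with a file name built from its name if the rasters need to be exported to disk.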

Related

calculate agreement of two maps at zonal level in R

I am trying to calculate the agreement of two different cropland maps at zonal/ grid level.
I first created the rasters that show cropland agreement between a classified raster and validation data that was created.
agreement<- sample_raster==raster_sampled
crop_agreement <- sample_raster==1 & raster_sampled==1
I then prepared the tiles for zonal analysis by:
tiles_poly$id<- as.double(tiles_poly$id)
tiles_poly$zone <- tiles_poly$id
tiles<-fasterize(st_as_sf(tiles_poly),raster_aoi, field="id")
This works fine.
However, when I'm trying to calculate the agreement at zonal level, I get an error.
zonal_crop_agreement<- zonal(crop_agreement,tiles,fun='sum', na.rm=TRUE)
Error in dimnames(x) <- dn :
length of 'dimnames' [2] not equal to array extent
As far as I understand, the error arises because the number of columns is not equal between the two rasters. Yet when I print the lengths of tiles and crop_agreement, they have the same length.
I therefore think that the error might be due to the "fun = 'sum'". I also do not exactly understand what that function does.
Does someone have an idea what the problem could be?
The complete code is here:
## INPUT DATA GOES HERE ##
test_name<- "220624_cropland" # use this to name the test. The exported files will have this name
tiles_poly<-readOGR("C:/data_cropland_burkina/Grid_50km.shp") #the tile grid
raster_aoi <- raster("C:/data_cropland_burkina/empty_raster.tif") #sets the resolution and area of interest
k_rast<-raster("C:/data_cropland_burkina/Cropland_Burkina_Copernicus.tif") # this is the classified raster I am analyzing
#k_rast <- rast_class
## This is to convert points into raster ##
sample_points<-readOGR("C:/Users/louis/OneDrive/Desktop/data_cropland_burkina/validation_points_poly0.shp", stringsAsFactors = FALSE) #the validation points
sample_points$class_2 <- as.numeric(sample_points$item) # make sure the class column is numeric
sample_raster<- fasterize(st_as_sf(sample_points), raster_aoi, field="item")
## crop the raster (makes the script run faster) ##
#aoi<- crop(raster_aoi,k_rast)
plot(k_rast)
#plot(sample_points, add=TRUE) ##maybe remove this by commenting out
crop_aoi <- function(x) {crop(x,raster_aoi)}
k_rast <- crop_aoi(k_rast)
sample_raster<-crop_aoi(sample_raster)
## ONLY USE THIS IF THE RASTER DOES NOT HAVE THE SAME RESOLUTION AS THE RASTER_AOI ##
#resample all rasters
resize_aoi<- function(x){raster::resample(x,raster_aoi,method="ngb")}
k_rast<-resize_aoi(k_rast)
##Reclassify if the raster is binary. otherwise comment out.
m <- c(0, 2, 1, 1)
rclmat <- matrix(m, ncol=2, byrow=TRUE)
k_rast <- reclassify(k_rast, rclmat)
## get the unique values from each raster ##
k_rast_vals<-sort(unique(na.omit(getValues(k_rast))))
sample_vals<-unique(na.omit(getValues(sample_raster)))
## extract the crop and non-crop pixels from the sample ##
sample_crop<- sample_raster==1
sample_non_crop<- sample_raster==2
sample_raster_binary<-sample_raster>0
## reduce the classified raster to only the pixels covered by the sample ##
raster_sampled<- k_rast*sample_raster_binary
## Calculate the confusion matrix ##
crop_comp <- function(x) cellStats((raster_sampled==x) * sample_crop, stat='sum')
non_crop_comp <- function(x) cellStats((raster_sampled==x) * sample_non_crop, stat='sum')
results_crop <- rbind(sapply(k_rast_vals,crop_comp))
colnames(results_crop)<- k_rast_vals
results_non_crop<- rbind(sapply(k_rast_vals,non_crop_comp))
colnames(results_non_crop)<- k_rast_vals
results<- rbind(results_crop, results_non_crop)
rownames(results) <- c("crop_ref", "non_crop_ref")
results<-cbind(results, total = rowSums(results))
results<-rbind(results, total = colSums(results))
results
## ------SECTION 2- Zonal Analysis - show agreement at the grid level------- ##
## Create rasters that show cropland agreement between the classified raster and validation data ##
agreement<- sample_raster==raster_sampled
crop_agreement <- sample_raster==1 & raster_sampled==1
## prepare the tiles for zonal analysis ##
tiles_poly$id<- as.double(tiles_poly$id)
tiles_poly$zone <- tiles_poly$id
tiles<-fasterize(st_as_sf(tiles_poly),raster_aoi, field="id")
## calculate agreement at the zonal/grid level (i.e. statistics on the cell values of one raster within the zones defined by the other dataset - validation data & input crop maps) ##
## the next line calculates the zonal crop agreement, taking the crop agreement raster created above and the prepared tiles as input
## na.rm = TRUE removes missing values (NA) from the data before the statistic is computed.
## fun = 'sum' adds up the cell values within each zone
zonal_crop_agreement<- zonal(crop_agreement,tiles,fun='sum', na.rm=TRUE)
zonal_crop_ref <- zonal(sample_crop,tiles,fun= 'sum', na.rm=TRUE)
zonal_crop_agreement<- data.frame(cbind(zonal_crop_agreement, zonal_crop_ref[,2]))
colnames(zonal_crop_agreement)[3]<- "crop_ref"
zonal_crop_agreement<- data.frame(cbind(zonal_crop_agreement,zonal(raster_sampled==1,tiles,fun="sum",na.rm=TRUE)[,2]))
colnames(zonal_crop_agreement)[4]<- "crop_class"
zonal_crop_agreement<- data.frame(cbind(zonal_crop_agreement,zonal(sample_raster,tiles,fun="count",na.rm=TRUE)[,2]))
colnames(zonal_crop_agreement)[5]<- "samp_pxls"
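On the question of what fun = 'sum' does in zonal(): it adds up the cell values of the first raster within each zone of the second. The sketch below uses a tiny made-up 4 x 4 raster (not the cropland data) with two hypothetical zones, so it illustrates the function only and does not diagnose the dimnames error.

```r
library(raster)

# Hypothetical 0/1 "agreement" raster with one missing cell
vals <- raster(matrix(c(1, 0, 1, 1,
                        0, 0, 1, NA,
                        1, 1, 0, 0,
                        1, 0, 0, 0), nrow = 4, byrow = TRUE))

# Hypothetical zones: top two rows are zone 1, bottom two rows are zone 2
zones <- raster(matrix(rep(c(1, 2), each = 8), nrow = 4, byrow = TRUE))

# Sum of cell values per zone; na.rm = TRUE skips the NA cell
z <- zonal(vals, zones, fun = "sum", na.rm = TRUE)
print(z)
```

Because the values are 0 or 1, the per-zone sum is simply the number of cells equal to 1 in that zone, which is how the script above uses it to count agreeing pixels per tile.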

Create neighborhood list of large dataset / speed up

I want to create a weight matrix based on distance. My code currently looks as follows and works for a smaller sample of the data. However, with the large dataset (569424 individuals in 24077 locations) it does not complete. The problem arises at the nb2blocknb function. So my question is: how can I optimize my code for large datasets?
library(spdep) # provides dnearneigh, nb2blocknb and nb2listw
# load all survey data
DHS <- read.csv("Daten/final.csv")
attach(DHS)
# define coordinates matrix
coormat <- cbind(DHS$location, DHS$lon_s, DHS$lat_s)
coorm <- cbind(DHS$lon_s, DHS$lat_s)
colnames(coormat) <- c("location", "lon_s", "lat_s")
coo <- cbind(unique(coormat))
c <- as.data.frame(coo)
coor <- cbind(c$lon_s, c$lat_s)
# get a list of neighboring locations that are within 50 km of each other
neighbor <- dnearneigh(coor, d1 = 0, d2 = 50, row.names=c$location, longlat=TRUE, bound=c("GE", "LE"))
# get neighborhood list on individual level
nb <- nb2blocknb(neighbor, as.character(DHS$location))
# weight matrix in list format
nbweights.lw <- nb2listw(nb, style="B", zero.policy=TRUE)
Thanks a lot for your help!
You're trying to make 1.3e10 distance calculations. The result would take up gigabytes of memory.
I think you'd want to limit either the maximum distance or the number of nearest neighbors you're looking for. Try nn2 from the RANN package:
library('RANN')
nearest_neighbours_w_distance<-nn2(coordinatesA, coordinatesB,10)
Note that this operation is not symmetric (switching coordinatesA and coordinatesB gives different results).
Also, you would first have to convert your GPS coordinates to a coordinate reference system in which you can calculate Euclidean distances, for example UTM (code not tested):
library("sp")
# Convert a two-column matrix of lon/lat coordinates (WGS84) to UTM
gps2utm <- function(gps_coordinates_matrix, utmzone) {
  # SpatialPoints takes the coordinates as a single two-column matrix
  pts <- SpatialPoints(gps_coordinates_matrix[, 1:2],
                       proj4string = CRS("+proj=longlat +datum=WGS84"))
  return(spTransform(pts, CRS(paste0("+proj=utm +zone=", utmzone, " +ellps=WGS84"))))
}
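To make the nn2 suggestion more concrete, here is a minimal sketch of a radius-limited search. It assumes the coordinates have already been projected to metres (for example with a helper like the one above); the point set `xy`, the cap of 10 neighbours and the 50 km radius are illustrative assumptions, not values taken from the survey data.

```r
library(RANN)
set.seed(1)

# Hypothetical projected coordinates in metres: 1000 locations in a 500 km square
xy <- cbind(x = runif(1000, 0, 5e5), y = runif(1000, 0, 5e5))

# Up to 11 nearest points within 50 km of each location
# (11 because each point is also returned as its own nearest neighbour)
nn <- nn2(data = xy, query = xy, k = 11,
          searchtype = "radius", radius = 50000)

# Drop the first column (the point itself) and the 0 entries,
# which nn2 uses to mark "no neighbour found within the radius"
neighbours <- lapply(seq_len(nrow(xy)), function(i) {
  idx <- nn$nn.idx[i, -1]
  idx[idx > 0]
})
```

The cap on k is what keeps memory bounded: the result is an n x k matrix instead of a full n x n distance matrix. Whether a capped neighbour list is acceptable for the weight matrix depends on the analysis.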

Raster in R: Create Zonal Count of specific cell values without reclassification

I would like to know if there is a way to create zonal statistics for RasterLayer objects, specifically the count of a given cell value (e.g. a land-use class), in R without having to reclassify the whole raster. The solution should be memory efficient in order to work on large raster files, i.e. extracting the values into a matrix in R is not desired.
Below is an example of how I have handled it so far. In this case I reclassify the original raster to hold only 1 for the value of interest and missing values for all other values.
My proposed solution creates both redundant data and additional processing steps to get me to my initial goal. I thought something like zonal(r1[r1==6],r2,"count") would work, but obviously it does not (see below).
# generate reproducible Raster
library("raster")
## RASTER 1 (e.g. land-use classes)
r1 <- raster( crs="+proj=utm +zone=31")
extent(r1) <- extent(0, 100, 0, 100)
res(r1) <- c(5, 5)
values(r1) <- sample(10, ncell(r1), replace=TRUE)
plot(r1)
## RASTER 2 (containing zones of interest)
r2 <- raster( crs="+proj=utm +zone=31")
extent(r2) <- extent(0, 100, 0, 100)
res(r2) <- c(5, 5)
values(r2) <- c(rep(1,100),rep(2,100),rep(3,100),rep(4,100))
plot(r2)
# (1) ZONAL STATISTICS
# a. how many cells per zone (independent of specific cell value)
zonal(r1,r2,"count")
# b. how many cells per zone of specific value 6
zonal(r1[r1==6],r2,"count")
# -> fails
# with reclassification
r1.reclass <- reclassify(r1,
                         matrix(c(1, 5, NA,
                                  5.5, 6.5, 1, # class of interest
                                  6.5, 10, NA),
                                ncol = 3,
                                byrow = T),
                         include.lowest = T # include the lowest value from the table.
)
zonal(r1.reclass,r2,"count")
You can use raster::match.
zonal(match(r1, 6),r2, "count")
As you can see from plot(match(r1, 6)), it only returns raster cells which hold the desired value(s). All other cells are NA.
r1[r1==6], as used in your attempt, unfortunately returns a vector and therefore cannot be used in zonal anymore.
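As a cross-check of the match approach, the comparison r1 == 6 without the square-bracket indexing stays a RasterLayer of TRUE/FALSE values, so summing it per zone should give the same counts. The sketch below uses small synthetic rasters built from matrices (an assumption made for brevity, not the UTM rasters from the question).

```r
library(raster)
set.seed(42)

# Synthetic land-use raster with classes 1..10 and four zones of 100 cells each
r1 <- raster(matrix(sample(10, 400, replace = TRUE), nrow = 20))
r2 <- raster(matrix(rep(1:4, each = 100), nrow = 20, byrow = TRUE))

# Approach from the answer: cells not equal to 6 become NA, then count non-NA cells
count_match <- zonal(match(r1, 6), r2, "count")

# Alternative: logical raster (TRUE = class 6), summed per zone
sum_logical <- zonal(r1 == 6, r2, "sum")

print(cbind(count_match, sum = sum_logical[, 2]))
```

Both variants create an intermediate raster, so neither avoids the extra layer entirely; they only avoid building the reclassification table by hand.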

How to subset a raster based on grid cell values

My following question builds on the solution proposed by @jbaums on this post: Global Raster of geographic distances
For the purpose of reproducing the example, I have a raster dataset of distances to the nearest coastline:
library(rasterVis); library(raster); library(maptools)
data(wrld_simpl)
# Create a raster template for rasterizing the polys.
r <- raster(xmn=-180, xmx=180, ymn=-90, ymx=90, res=1)
# Rasterize and set land pixels to NA
r2 <- rasterize(wrld_simpl, r, 1)
r3 <- mask(is.na(r2), r2, maskvalue=1, updatevalue=NA)
# Calculate distance to nearest non-NA pixel
d <- distance(r3) # if calculating distances on land instead of ocean: d <- distance(r3)
# Optionally set non-land pixels to NA (otherwise values are "distance to non-land")
d <- d*r2
levelplot(d/1000, margin=FALSE, at=seq(0, maxValue(d)/1000, length=100),colorkey=list(height=0.6), main='Distance to coast (km)')
The data looks like this:
From here, I need to subset the distance raster (d), or create a new raster, that only contains cells for which the distance to the coastline is less than 200 km. I have tried using getValues() to identify the cells for which the value is <= 200 (as shown below), but so far without success. Can anyone help? Am I on the right track?
#vector of desired cell numbers
my.pts <- which(getValues(d) <= 200)
# create raster the same size as d filled with NAs
bar <- raster(ncols=ncol(d), nrows=nrow(d), res=res(d))
bar[] <- NA
# replace the values with those in d
bar[my.pts] <- d[my.pts]
I think this is what you are looking for. You can treat a raster like a matrix here, right after your d <- d*r2 line:
d[d>=200000]<-NA
levelplot(d/1000, margin=FALSE, at=seq(0, maxValue(d)/1000, length=100),colorkey=list(height=0.6), main='Distance to coast (km)')
(In case you forgot: the unit is meters, so the threshold should be 200000, not 200.)
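If the original distance raster should stay untouched, one alternative is to build the subset as a new object with mask(). The sketch below is only an illustration on a synthetic stand-in for d (random values in metres); the names `threshold_m` and `near_coast` are made up for this example.

```r
library(raster)
set.seed(7)

# Hypothetical stand-in for the distance raster `d`, values in metres
d <- raster(matrix(runif(100, min = 0, max = 1e6), ncol = 10))

# 200 km expressed in the raster's unit (metres)
threshold_m <- 200 * 1000

# d < threshold_m is a TRUE/FALSE raster; cells where it is FALSE (0) become NA
near_coast <- mask(d, d < threshold_m, maskvalue = 0)

# d itself is unchanged, near_coast holds only the cells below the threshold
print(sum(!is.na(getValues(near_coast))))
```

This keeps the same cells as d[d>=200000] <- NA, but writes the result to a separate raster instead of overwriting d.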

Gap fill for a raster using another in R

I have two rasters for the same day but with different swaths. I want to combine them but I am conscious that retrieval algorithms may be different. Both rasters are of the same dimension. What's the easiest way to do this in R please? I will be running this on a list.
library(raster)
A <- raster(nrows=108, ncols=21, xmn=0, xmx=10)
A[] <- 1:ncell(A)
xy <- matrix(rnorm(ncell(A)),108,21)
B<- raster(xy)
## Induce NAs in raster B:
B[sample(1:ncell(B), 1000)] <- NA
## Confirm we have 1000 NAs:
sum(is.na(B[]))
If there were NA pixels in raster B that had values in the other raster A, how do I fill the raster B based on the correlation between points with values in both rasters A and B, please?
As long as your rasters are of equal dimension and spatially coincident:
## Create indices for pixels that are NA in B and not NA in A:
indices <- is.na(B)[] & !is.na(A)[]
B[indices] <- A[indices]
If they are not already of the same dimension and spatially coincident then use the resample() function first to match the rasters to the desired dimensions and extent (projecting first if necessary using projectRaster()).
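The raster package also provides cover(), which replaces NA cells in its first argument with values from the second, and should give the same result as the index approach when both rasters share one grid. The sketch below checks this on two small synthetic rasters (built from matrices so that they are spatially coincident, unlike A and B in the question); note that it performs a plain fill and does not model any correlation between the two sources.

```r
library(raster)
set.seed(3)

# Two synthetic rasters on the same 10 x 10 grid
A <- raster(matrix(1:100, ncol = 10))
B <- raster(matrix(rnorm(100), ncol = 10))

# Induce 30 gaps in B
B[sample(ncell(B), 30)] <- NA

# Index approach from the answer
filled_idx <- B
gaps <- is.na(B[]) & !is.na(A[])
filled_idx[gaps] <- A[gaps]

# cover(): take B where available, otherwise fall back to A
filled_cover <- cover(B, A)
```

For a list of raster pairs, the same call could be applied with Map(cover, list_B, list_A), where list_B and list_A are hypothetical lists of matching rasters.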
