Gap fill for a raster using another in R

I have two rasters for the same day but with different swaths. I want to combine them but I am conscious that retrieval algorithms may be different. Both rasters are of the same dimension. What's the easiest way to do this in R please? I will be running this on a list.
library(raster)
## Raster A: fully valued
A <- raster(nrows=108, ncols=21, xmn=0, xmx=10)
A[] <- 1:ncell(A)
## Raster B: same dimensions and extent as A, random values
B <- raster(A)
B[] <- rnorm(ncell(B))
## Induce NAs in raster B:
B[sample(1:ncell(B), 1000)] <- NA
## Confirm we have 1000 NAs:
sum(is.na(B[]))
# [1] 1000
If there were NA pixels in raster B that had values in the other raster A, how do I fill the raster B based on the correlation between points with values in both rasters A and B, please?

As long as your rasters are of equal dimension and spatially coincident:
## Create indices for pixels that are NA in B and not NA in A:
indices <- is.na(B)[] & !is.na(A)[]
B[indices] <- A[indices]
If they are not already of the same dimension and spatially coincident then use the resample() function first to match the rasters to the desired dimensions and extent (projecting first if necessary using projectRaster()).
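Because the two swaths may come from different retrieval algorithms, you may prefer to calibrate A against B before filling rather than copying values across unchanged. A minimal sketch of the correlation-based fill the question asks about, assuming a simple linear relationship between the two products is adequate (the lm()/predict() step is illustrative, not the only choice of model):
## Cell values as vectors:
av <- A[]; bv <- B[]
## Cells with values in both rasters:
both <- !is.na(av) & !is.na(bv)
## Fit a linear calibration of B on A over the overlap:
fit <- lm(b ~ a, data = data.frame(a = av[both], b = bv[both]))
## Cells that are NA in B but have a value in A:
fill <- is.na(bv) & !is.na(av)
## Fill B with predictions from A:
B[fill] <- predict(fit, newdata = data.frame(a = av[fill]))
If plain substitution is all you need, cover(B, A) from the raster package does the same as the indexing above in a single call.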

Related

How to find area (in hectares) of a raster in R

I am trying to find the area of my raster as shown:
# read in my raster
abvco2 <- raster("Avitabile_AGB_Map.tif")
# clip it to Indonesia
abvco2new <- mask(abvco2, Indonesia)
# find the area
area <- area(abvco2new)
However, area() is not helpful here, as it does not return a single area value.
Thank you.
The area function is for computing cell sizes of lon/lat data. From what you say, that does not apply to your case. For a projected raster all cells have the same size, so the area of a raster x is the number of cells multiplied by the area of one cell (the product of the x and y resolution):
ncell(x) * prod(res(x))
Here is a minimal reproducible example
library(raster)
f <- system.file("external/test.grd", package="raster")
x <- raster(f)
# area of one cell (in squared map units, here metres)
aone <- prod(res(x))
# total area
ncell(x) * aone
# [1] 14720000
If you want to exclude some areas (e.g. NA cells) you can use cellStats or zonal.
Area excluding NA cells:
b <- cellStats(!is.na(x), sum)
b * aone
#[1] 5084800
To express the result in hectares, divide the area in square metres by 10,000.
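For completeness, when a raster is in lon/lat coordinates (the case area() is designed for), cell sizes vary with latitude and area() is the right tool. A small sketch, using the default 1-degree global grid that raster() creates:
library(raster)
y <- raster()       # lon/lat raster, 1-degree resolution
y[] <- 1
a <- area(y)        # RasterLayer of cell sizes, in km2
cellStats(a, sum)   # total area in km2; multiply km2 by 100 for hectares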

Calculating the number of pixels (count) in raster files in R

I have a huge number of raster files and a polygon that lies within the extent of all of them. I want to get the pixel count for each raster file within the polygon.
Additionally, I want to create a table listing the name of each raster file and its pixel count.
I have tried stacking, but then I cannot keep track of the names. Is there another way of performing this task in R?
Always include example data, please:
library(raster)
fn <- system.file("external/rlogo.grd", package="raster")
s <- stack(fn)
# introduce some NA cells
s[[1]][1:5000] <- NA
s[[2]][5001:ncell(s)] <- NA
names(s)
# [1] "red"   "green" "blue"
# an example polygon
p <- rbind(c(5,20), c(25,55), c(50,20), c(20,6), c(5,20))
pol <- spPolygons(p)
plot(s, addfun=function() lines(pol, lwd=2))
I am not quite sure what you are after. The number of cells (pixels) is the same for all rasters if you can stack them (which you say you can), so I assume you want the number of cells that are not NA. If your rasters actually have different origins or resolutions, you can repeat these steps without stacking them into a RasterStack, adjusting the approach if you also need to count the NA cells.
Simple approach for smaller objects:
m <- mask(s, pol)
cellStats(m, function(i, ...) sum(!is.na(i)))
# red green blue
# 600 506 1106
If that runs out of memory, you can do:
m <- mask(s, pol)
x <- reclassify(m, cbind(-Inf, Inf, 1))
names(x) <- names(m)
cellStats(x, 'sum')
#red green blue
#600 506 1106
You can also try:
extract(s, pol, fun=function(x, ...) length(na.omit(x)))
# red green blue
#[1,] 600 506 1106
If you want to count all the cells (whether NA or not), you can do something like
# example RasterLayer
r <- s[[1]]
# this step may help in speed if your polygon is small relative to the raster
r <- crop(r, pol)
x <- rasterize(pol, r, 1)
cellStats(x, 'sum')
#[1] 1106
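To build the table of file names and pixel counts that the question asks for, a sketch along these lines should work, looping over the files instead of stacking them; the files vector is a placeholder for your own paths:
library(raster)
files <- list.files("path/to/rasters", pattern="\\.tif$", full.names=TRUE)
counts <- sapply(files, function(f) {
  # count the non-NA cells of each raster inside the polygon
  m <- mask(crop(raster(f), pol), pol)
  cellStats(!is.na(m), sum)
})
result <- data.frame(file=basename(files), count=counts, row.names=NULL)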

Iterate through Incremental Raster Values in R to Create New Raster Datasets

Apologies if this has been asked before, but I could not find a solution to what I believe is a rather simple problem. I have a single source raster dataset containing continuous floating-point values ranging between 100 and 500. I would like to loop through this source raster in increments of 50 and export/create new raster datasets containing only the values that are less than or equal to each increment. For example, I have the following R code (using the raster library) to read the raster and define the increments. I would like to develop a way to automatically create nine output raster datasets, each holding the values less than or equal to one increment. I can't seem to get there. Can anyone help? TIA!
# Trying to iteratively create new raster datasets
# based on increments of the source raster
library(raster)
setwd("C:/Path/To/Folder")
r <- raster("Source_Raster.tif") # floating-point values between 100 and 500
# Create the increments I would like to use
increments <- seq(100, 500, 50)
# This creates the following sequence:
# 100 150 200 250 300 350 400 450 500
### THIS IS WHERE I STRUGGLE ###
# I would like to use the sequence to create new raster datasets
# that only include values from the source raster that are less
# than or equal to each increment. For example, the first output
# raster will contain values less than or equal to the first
# increment (100):
r100 <- calc(r, fun=function(x){ x[x > 100] <- NA; return(x) })
After you have determined the break points, you can use lapply to create each raster layer. In this example, r_list is the final output, a list of nine raster layers.
library(raster)
set.seed(145)
# Create example raster
r <- raster(matrix(runif(100, min = 100, max = 500), ncol = 10))
# Create break points
brk <- seq(100, 500, 50)
# Create one raster per break point, keeping only values <= that break
r_list <- lapply(brk, function(x){
  temp <- r
  temp[temp > x] <- NA
  return(temp)
})
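Since the question also asks to export the new datasets, here is a minimal sketch of that step, assuming GeoTIFF output with file names built from the break points (the naming scheme is illustrative):
# write each raster to disk, one file per increment
for (i in seq_along(brk)) {
  writeRaster(r_list[[i]], filename=paste0("raster_lte_", brk[i], ".tif"),
              overwrite=TRUE)
}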

Create neighborhood list of a large dataset / speed it up

I want to create a weight matrix based on distance. My code currently looks as follows and works for a smaller sample of the data. However, with the large dataset (569424 individuals in 24077 locations) it does not finish. The problem arises at the nb2blocknb() function. So my question is: how can I optimize my code for large datasets?
library(spdep)
# load all survey data
DHS <- read.csv("Daten/final.csv")
# define coordinates matrix
coormat <- cbind(DHS$location, DHS$lon_s, DHS$lat_s)
colnames(coormat) <- c("location", "lon_s", "lat_s")
# one row per unique location
coo <- as.data.frame(unique(coormat))
coor <- cbind(coo$lon_s, coo$lat_s)
# get a neighbour list of locations that lie within 50 km of each other
neighbor <- dnearneigh(coor, d1 = 0, d2 = 50, row.names = coo$location,
                       longlat = TRUE, bound = c("GE", "LE"))
# get the neighbourhood list on the individual level
nb <- nb2blocknb(neighbor, as.character(DHS$location))
# weight matrix in list format
nbweights.lw <- nb2listw(nb, style = "B", zero.policy = TRUE)
Thanks a lot for your help!
You're trying to make about 1.3e10 distance calculations; the results would run to gigabytes.
I think you'd want to limit either the maximum distance or the number of nearest neighbors you're looking for. Try nn2 from the RANN package:
library(RANN)
# for each point in coordinatesB, find its 10 nearest neighbours in coordinatesA
nearest_neighbours_w_distance <- nn2(coordinatesA, coordinatesB, k = 10)
Note that this operation is not symmetric: switching coordinatesA and coordinatesB gives different results.
Also, you would first have to convert your GPS coordinates to a coordinate reference system in which you can calculate Euclidean distances, for example UTM (code not tested):
library("sp")
gps2utm<-function(gps_coordinates_matrix,utmzone){
spdf<-SpatialPointsDataFrame(gps_coordinates_matrix[,1],gps_coordinates_matrix[,2])
proj4string(spdf) <- CRS("+proj=longlat +datum=WGS84")
return(spTransform(spdf, CRS(paste0("+proj=utm +zone=",utmzone," ellps=WGS84"))))
}
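A hypothetical call chaining the two steps on the unique-location matrix coor from the question (zone 37 is only a placeholder; use the zone covering your study area):
utm <- coordinates(gps2utm(coor, 37))
# 10 nearest neighbours of every location, with distances in metres
nn <- nn2(utm, utm, k = 10)
str(nn)   # a list with elements nn.idx and nn.dists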

How to subset a raster based on grid cell values

My question builds on the solution proposed by @jbaums in this post: Global Raster of geographic distances
For the purpose of reproducing the example, I have a raster dataset of distances to the nearest coastline:
library(rasterVis); library(raster); library(maptools)
data(wrld_simpl)
# Create a raster template for rasterizing the polys
r <- raster(xmn=-180, xmx=180, ymn=-90, ymx=90, res=1)
# Rasterize and set land pixels to NA
r2 <- rasterize(wrld_simpl, r, 1)
r3 <- mask(is.na(r2), r2, maskvalue=1, updatevalue=NA)
# Calculate, for every NA (land) cell, the distance to the nearest non-NA (ocean) cell
d <- distance(r3)
# Optionally set non-land pixels to NA (otherwise values are "distance to non-land")
d <- d*r2
levelplot(d/1000, margin=FALSE, at=seq(0, maxValue(d)/1000, length=100),
          colorkey=list(height=0.6), main='Distance to coast (km)')
(Figure: global map of distance to coast, in km.)
From here, I need to subset the distance raster (d), or create a new raster, that only contains cells whose distance to the coastline is less than 200 km. I have tried using getValues() to identify the cells for which the value is <= 200 (as shown below), but so far without success. Can anyone help? Am I on the right track?
#vector of desired cell numbers
my.pts <- which(getValues(d) <= 200)
# create raster the same size as d filled with NAs
bar <- raster(ncols=ncol(d), nrows=nrow(d), res=res(d))
bar[] <- NA
# replace the values with those in d
bar[my.pts] <- d[my.pts]
I think this is what you are looking for: you can treat a raster like a matrix here, right after your d <- d*r2 line:
d[d >= 200000] <- NA
levelplot(d/1000, margin=FALSE, at=seq(0, maxValue(d)/1000, length=100),colorkey=list(height=0.6), main='Distance to coast (km)')
(In case you forgot: the unit is metres, so the threshold should be 200000, not 200.)
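As for the "am I on the right track?" part: the getValues() approach from the question also works once the threshold is in metres and the template raster copies the geometry of d. A sketch, assuming the same objects as above:
# cells within 200 km of the coast
my.pts <- which(getValues(d) <= 200000)
# raster(d) copies extent, resolution and CRS but holds no values (all NA)
bar <- raster(d)
bar[my.pts] <- d[my.pts]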
