Calculating the number of pixels (count) in raster files in R

I have a huge number of raster files and a polygon that lies within the extent of the raster files. I want to get the number of pixels (count) within the polygon for each raster file.
Additionally, I want to create a table listing the name of each raster file and its number of pixels.
I have tried stacking, but then I cannot keep track of the names. Is there any other way of performing this task in R?

Always include example data, please
library(raster)
fn <-system.file("external/rlogo.grd", package="raster")
s <- stack(fn)
s[[1]][1:5000] <- NA
s[[2]][5001:ncell(s)] <- NA
names(s)
#[1] "red" "green" "blue"
p <- rbind(c(5,20), c(25,55), c(50, 20), c(20,6), c(5,20))
pol <- spPolygons(p)
plot(s, addfun=function() lines(pol, lwd=2))
I am not quite sure what you are after. The number of cells (pixels) would be the same for all rasters if you can stack them (which you say you can). I am assuming that you want the number of cells that are not NA. If you actually have rasters with a different origin/resolution, you can repeat these steps for each raster; there is no need to stack them into a RasterStack. If you also want to count the NA cells, you would need to adjust the approach.
Simple approach for smaller objects:
m <- mask(s, pol)
cellStats(m, function(i, ...) sum(!is.na(i)))
# red green blue
# 600 506 1106
If that runs out of memory, you can do:
m <- mask(s, pol)
x <- reclassify(m, cbind(-Inf, Inf, 1))
names(x) <- names(m)
cellStats(x, 'sum')
#red green blue
#600 506 1106
You can also try:
extract(s, pol, fun=function(x,...)length(na.omit(x)))
# red green blue
#[1,] 600 506 1106
If you want to count all the cells (whether NA or not), you can do something like this:
# example RasterLayer
r <- s[[1]]
# this step may help in speed if your polygon is small relative to the raster
r <- crop(r, pol)
x <- rasterize(pol, r, 1)
cellStats(x, 'sum')
#[1] 1106
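The question also asks for a table of file names and counts. Below is a minimal sketch of one way to build it, looping over files instead of stacking them. The temporary files, the helper name count_in_polygon and the column names are made up for illustration; the example writes the three logo layers to disk only to stand in for "a huge number of raster files".

```r
library(raster)

# Polygon from the example above
p <- rbind(c(5,20), c(25,55), c(50,20), c(20,6), c(5,20))
pol <- spPolygons(p)

# Stand-in for many raster files: write each layer of the logo stack
# to its own temporary file (these file names are assumptions)
s <- stack(system.file("external/rlogo.grd", package = "raster"))
files <- file.path(tempdir(), paste0(names(s), ".grd"))
for (i in seq_along(files)) {
  writeRaster(s[[i]], files[i], overwrite = TRUE)
}

# Hypothetical helper: number of non-NA cells of one file inside the polygon
count_in_polygon <- function(f, pol) {
  r <- raster(f)
  m <- mask(crop(r, pol), pol)      # cells outside the polygon become NA
  cellStats(!is.na(m), "sum")       # count the cells that are left
}

# One row per file: name and count
result <- data.frame(
  file  = basename(files),
  cells = unname(sapply(files, count_in_polygon, pol = pol))
)
result
```

Because each file is processed on its own, the rasters do not need to share an origin or resolution, and the file name stays attached to its count.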

Related

How to find area (in hectares) of a raster in R

I am trying to find the area of my raster as shown:
#reading in my raster
abvco2=raster("Avitabile_AGB_Map.tif")
#clipping it to Indonesia
abvco2new=mask(abvco2,Indonesia)
#finding the area
area=area(abvco2new)
However, area() is not helpful as it does not return me with a single area value.
Thank you.
The area function computes cell sizes for lon/lat data. From what you say, that does not apply to your case. All cells have the same size, so the area of the raster x is
ncell(x) * prod(res(x))
Here is a minimal reproducible example
library(raster)
f <- system.file("external/test.grd", package="raster")
x <- raster(f)
# area of one cell (in squared map units; here 40 m x 40 m)
aone <- prod(res(x))
# total area
ncell(x) * aone
# 14720000
If you want to exclude some areas (e.g. NA) you can use cellStats or zonal
Area excluding NA
b <- cellStats(!is.na(x), sum)
b * aone
#[1] 5084800
The map units of this example are meters, so divide by 10000 to get hectares.
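Since the question asks for hectares, here is a short sketch of the unit conversion. It assumes the cell size is given by res(x) and that the map units are meters (true for the test file, not necessarily for your data). The second half shows the lon/lat case, where area() returns the size of each cell in km2; the global raster used there is only an illustration.

```r
library(raster)

# Projected raster with map units in meters (assumption)
x <- raster(system.file("external/test.grd", package = "raster"))
cell_m2 <- prod(res(x))                 # area of one cell in m2
n_valid <- cellStats(!is.na(x), "sum")  # number of non-NA cells
area_ha <- n_valid * cell_m2 / 10000    # 1 ha = 10000 m2
area_ha

# Lon/lat raster: cell size varies with latitude, so sum the per-cell areas
r <- raster(ncol = 36, nrow = 18)       # global 10-degree grid
values(r) <- 1
a_km2 <- area(r, na.rm = TRUE)          # km2 per cell, NA where r is NA
lonlat_ha <- cellStats(a_km2, "sum") * 100   # 1 km2 = 100 ha
lonlat_ha
```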

How can I get the number of pixels with NA value in a raster that is clipped from a large raster by many polygons?

I applied a cloud mask to a raster image in R, and want to check how many pixels are masked out. But what I really need are only the parts of the image within some polygons (400+ of them), so I only want to get the number of pixels with no value within the polygons.
Here is what I have done:
library(raster)
library(rgdal)
##Read the raster files
tb = raster('D:/HLS/NDVI_Month_2018_TB.tif', band = 6)
##Read the polygon (400 polygons)
crops = readOGR('D:/HLS/shapefile/tb/tb.shp')
##reproject the vector
new_crops = spTransform(crops, crs(tb))
##Clip the raster with polygons
cliped = crop(tb, extent(new_crops))
output = mask(cliped, new_crops)
##Check the NA value
freq(output, value = NA)
However, what I got from the freq() function seems to be a count over all the pixels within the cropped area (the extent from the crop() function), not only those within the polygons.
The result of freq():
How can I get the NA value within the polygons?
Here is a minimal, self-contained, reproducible example (taken mostly from ?raster::extract)
Example raster and polygons
library(raster)
r <- raster(ncol=90, nrow=45)
values(r) <- 1:ncell(r)
r[seq(1,ncell(r),3)] <- NA
p1 <- rbind(c(-180,-20), c(-140,55), c(0, 0), c(-140,-60), c(-180,-20))
p2 <- rbind(c(10,0), c(140,60), c(160,0), c(140,-55), c(10,0))
pols <- spPolygons(p1, p2)
Solution 1
extract(r, pols, fun=function(i, ...) sum(is.na(i)))
# [,1]
#[1,] 215
#[2,] 178
Solution 2
z <- rasterize(pols, r)
zonal(is.na(r), z, "sum")
# zone sum
#[1,] 1 215
#[2,] 2 178
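To judge how much of each polygon is masked out, the NA count can be set against the total number of cells per polygon. The following is a sketch built on Solution 2 with the same example data; the column names and the helper raster ones are assumptions made for illustration.

```r
library(raster)

# Same example data as above
r <- raster(ncol = 90, nrow = 45)
values(r) <- 1:ncell(r)
r[seq(1, ncell(r), 3)] <- NA
p1 <- rbind(c(-180,-20), c(-140,55), c(0,0), c(-140,-60), c(-180,-20))
p2 <- rbind(c(10,0), c(140,60), c(160,0), c(140,-55), c(10,0))
pols <- spPolygons(p1, p2)

# Zone raster: each cell holds the number of the polygon it falls in
z <- rasterize(pols, r)

# NA cells per polygon
na_cells <- zonal(is.na(r), z, "sum")

# All cells per polygon: sum a raster of ones over the same zones
ones <- setValues(r, rep(1, ncell(r)))
all_cells <- zonal(ones, z, "sum")

tab <- data.frame(
  polygon   = na_cells[, 1],
  na_cells  = na_cells[, 2],
  all_cells = all_cells[, 2]
)
tab$na_share <- tab$na_cells / tab$all_cells
tab
```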

How to subset a raster based on grid cell values

My following question builds on the solution proposed by @jbaums on this post: Global Raster of geographic distances
For the purpose of reproducing the example, I have a raster dataset of distances to the nearest coastline:
library(rasterVis); library(raster); library(maptools)
data(wrld_simpl)
# Create a raster template for rasterizing the polys.
r <- raster(xmn=-180, xmx=180, ymn=-90, ymx=90, res=1)
# Rasterize and set land pixels to NA
r2 <- rasterize(wrld_simpl, r, 1)
r3 <- mask(is.na(r2), r2, maskvalue=1, updatevalue=NA)
# Calculate distance to nearest non-NA pixel
d <- distance(r3) # if calculating distances on land instead of ocean: d <- distance(r3)
# Optionally set non-land pixels to NA (otherwise values are "distance to non-land")
d <- d*r2
levelplot(d/1000, margin=FALSE, at=seq(0, maxValue(d)/1000, length=100),colorkey=list(height=0.6), main='Distance to coast (km)')
The data looks like this:
From here, I need to subset the distance raster (d), or create a new raster, that only contains cells for which the distance to the coastline is less than 200 km. I have tried using getValues() to identify the cells for which the value is <= 200 (as shown below), but so far without success. Can anyone help? Am I on the right track?
#vector of desired cell numbers
my.pts <- which(getValues(d) <= 200)
# create raster the same size as d filled with NAs
bar <- raster(ncols=ncol(d), nrows=nrow(d), res=res(d))
bar[] <- NA
# replace the values with those in d
bar[my.pts] <- d[my.pts]
I think this is what you are looking for. You can treat a raster like a matrix here, right after your d <- d*r2 line:
d[d>=200000]<-NA
levelplot(d/1000, margin=FALSE, at=seq(0, maxValue(d)/1000, length=100),colorkey=list(height=0.6), main='Distance to coast (km)')
(in case you forgot: the unit is in meters so the threshold should be 200000, not 200)
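A small self-contained sketch of the same thresholding, assuming values in meters, may make the behavior easier to check. The toy raster and the names near and near2 are made up for illustration; mask() with maskvalue is shown as an alternative that leaves the original raster untouched.

```r
library(raster)

# Toy "distance" raster in meters: 0, 5000, ..., 495000
r <- raster(ncol = 10, nrow = 10)
values(r) <- seq(0, by = 5000, length.out = ncell(r))

# Option 1: index with a logical raster and overwrite with NA
near <- r
near[near >= 200000] <- NA

# Option 2: mask out the cells where the condition is TRUE (value 1)
near2 <- mask(r, r >= 200000, maskvalue = 1)

# Number of cells closer than 200 km
sum(!is.na(getValues(near)))
```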

Gap fill for a raster using another in R

I have two rasters for the same day but with different swaths. I want to combine them but I am conscious that retrieval algorithms may be different. Both rasters are of the same dimension. What's the easiest way to do this in R please? I will be running this on a list.
library(raster)
A <- raster(nrows=108, ncols=21, xmn=0, xmx=10)
A[] <- 1:ncell(A)
xy <- matrix(rnorm(ncell(A)),108,21)
B<- raster(xy)
## Induce NAs in raster B:
B[sample(1:ncell(B), 1000)] <- NA
## Confirm we have 1000 NAs:
sum(is.na(B[]))
If there were NA pixels in raster B that had values in the other raster A, how do I fill the raster B based on the correlation between points with values in both rasters A and B, please?
As long as your rasters are of equal dimension and spatially coincident:
## Create indices for pixels that are NA in B and not NA in A:
indices <- is.na(B)[] & !is.na(A)[]
B[indices] <- A[indices]
If they are not already of the same dimension and spatially coincident then use the resample() function first to match the rasters to the desired dimensions and extent (projecting first if necessary using projectRaster()).
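The question also mentions filling based on the relationship between the two rasters. The sketch below shows two options and is not part of the original answer: cover() from the raster package for a direct fill, and a linear regression fitted with lm() on the cells where both rasters have values, under the assumption that the relationship is roughly linear. B is built here on the same grid as A with made-up values so that the two rasters coincide.

```r
library(raster)
set.seed(1)

A <- raster(nrows = 108, ncols = 21, xmn = 0, xmx = 10)
A[] <- 1:ncell(A)

# B shares A's grid; its values are a noisy linear function of A (made up)
B <- setValues(A, 2 * getValues(A) + rnorm(ncell(A)))
B[sample(ncell(B), 1000)] <- NA

# Option 1: direct fill, NA cells in B take the value of A
filled <- cover(B, A)

# Option 2: regression-adjusted fill
df <- data.frame(a = getValues(A), b = getValues(B))
fit <- lm(b ~ a, data = df)                       # rows with NA in b are dropped
pred <- setValues(A, predict(fit, newdata = df))  # A rescaled to B's units
filled_adj <- cover(B, pred)

sum(is.na(getValues(filled_adj)))
```

Option 2 keeps the observed values of B and only uses the rescaled A where B is missing, which matters when the two retrieval algorithms differ in scale or offset.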

Set single raster to NA where values of raster stack are NA

I have two 30m x 30m raster files which I would like to sample points from. Prior to sampling, I would like to remove the clouded areas from the images. I turned to R and Hijmans' raster package for the task.
Using the drawPoly(sp=TRUE) command, I drew 18 different polygons. The function did not seem to allow 18 polygons as one sp object, so I drew them all separately. I then gave the polygons a proj4string matching the rasters', and put them into a list. I ran the list through an lapply call to convert them to rasters (the rasterize function in the raster package) with the polygon areas set to NA, and the rest of the image set to 1.
My end goal is one raster layer with the 18 areas set to NA. I have tried stacking the list of rasterized polygons, and subsetting it to set a new raster to NA in the same areas. My reproducible code is below.
library(raster)
r1 <- raster(nrow=50, ncol = 50)
r1[] <- 1
r1[4:10,] <- NA
r2 <- raster(nrow=50, ncol = 50)
r2[] <- 1
r2[9:15,] <- NA
r3 <- raster(nrow=50, ncol = 50)
r3[] <- 1
r3[24:39,] <- NA
r4 <- raster(nrow=50, ncol = 50)
r4[] <- 1
s <- stack(r1, r2, r3)
test.a.cool <- calc(s, function(x){r4[is.na(x)==1] <- NA})
For whatever reason, test.a.cool is a blank plot, whereas I'm aiming for a raster in which all values are equal to 1 except where the stack, s, has NAs.
Any tips?
Thanks.
Doing sum(s) will work, as sum() returns NA for any grid cell with even one NA value in the stack.
To see that it works, compare the figures produced by the following:
plot(s)
plot(sum(s))
I posted this question on the R-Sig-Geo forum, as well, and received a response from the package author. The two simplest solutions:
Use the sp package to rbind my polygons into one, then rasterize the polygon.
p <- rbind(p1, p2, p3...etc., makeUniqueIDs = TRUE)
r4 <- raster(nrow=50, ncol = 50)
r4[] <- 1
mask <- rasterize(p, r4)
mask[mask %in% 1:18] <- 1
#The above code produces a single raster file with
#my polygons as unique values, ready for masking.
And the second simple solution, as just pointed out by Josh O'Brien:
m <- sum(s)
test <- mask(r4, m)
The R community rocks. Problem solved (twice) within an hour. Thanks.
I'm not familiar with the package you are using, however looking at the final line in your code, I think the issue might be here:
function(x){r4[is.na(x)==1] <- NA})
It doesn't look like calc will do much with that. It is setting the values of r4 indexed by the NAs of x and setting those to NA.
What then? If anything, maybe:
function(x){r4[is.na(x)==1] <- NA; return(r4) })
Although, it's not clear if that is even what you are after.
You were on the right track. The [ operator is defined for rasters and raster stacks, so you could just use the single line:
r4[ any(is.na(s) ) ] <- NA
plot(r4)
If you wanted to use calc you could have used it like this:
r4 <- calc( s, function(x){ ( ! any( is.na(x) ) ) } )
# calc() returns 1 where no layer is NA and 0 otherwise; turn the 0s into NA
r4[r4 == 0] <- NA
plot(r4)
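As a quick check, the two one-step approaches from the answers above can be run side by side on the example data from the question. This is only a sketch; the names a and b are made up for illustration.

```r
library(raster)

# Example layers from the question
r1 <- raster(nrow = 50, ncol = 50); r1[] <- 1; r1[4:10, ] <- NA
r2 <- raster(nrow = 50, ncol = 50); r2[] <- 1; r2[9:15, ] <- NA
r3 <- raster(nrow = 50, ncol = 50); r3[] <- 1; r3[24:39, ] <- NA
r4 <- raster(nrow = 50, ncol = 50); r4[] <- 1
s <- stack(r1, r2, r3)

# Approach 1: sum() is NA wherever any layer is NA; use it as a mask
a <- mask(r4, sum(s))

# Approach 2: index r4 with a raster that is 1 where any layer is NA
b <- r4
b[any(is.na(s))] <- NA

# Both should blank rows 4-15 and 24-39: 28 rows x 50 columns
sum(is.na(getValues(a)))
identical(is.na(getValues(a)), is.na(getValues(b)))
```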
