R raster functions, splitting multiple rasters from one - r

I have a simple function that splits a raster object into three different classes. However, my function doesn't return these rasters. I also read this tutorial http://cran.r-project.org/web/packages/raster/vignettes/functions.pdf
and according to it this is "a really bad way of doing this". However, the 'right way' seems overly complicated. Is there really no simple way of doing this (considering that functions should make things easier for you, not the other way around)?
I'm quite new to processing rasters with R, so forgive me if this is a stupid question.
rm(list=ls(all=TRUE))
library(raster)
r <- raster(ncol=10, nrow=10)
r[] <- rnorm(100, 100, 5)
# Create split function // three classes
splitrast <- function(rast, quantiles) {
  print("Splitting raster...")
  q <- quantile(rast, probs=quantiles)
  r1 <- rast; r2 <- rast; r3 <- rast    # copy the raster three times
  r1[rast > q[1]] <- NA                 # keep values below the lower quantile
  r2[rast <= q[1] | rast >= q[2]] <- NA # keep values between the quantiles
  r3[rast < q[2]] <- NA                 # keep values above the upper quantile
  par(mfrow=c(1,3))
  plot(r1); plot(r2); plot(r3)
  rast <- brick(r1, r2, r3)
  return(rast)
}
splitrast(r,c(0.2,0.8))
ls()
EDIT: reproducible example added

Don't try to return them separately. Instead, return(list(r1, r2, r3)). But see comments about style.
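For instance, a minimal sketch of the list approach (splitrast_list is a hypothetical name; plotting omitted for brevity):
splitrast_list <- function(rast, quantiles) {
  q <- quantile(rast, probs=quantiles)
  r1 <- rast; r2 <- rast; r3 <- rast
  r1[rast > q[1]] <- NA
  r2[rast <= q[1] | rast >= q[2]] <- NA
  r3[rast < q[2]] <- NA
  list(r1, r2, r3) # all three rasters in one returned object
}
res <- splitrast_list(r, c(0.2, 0.8))
r1 <- res[[1]]; r2 <- res[[2]]; r3 <- res[[3]]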

The raster subset function can help here. After you return the brick, you can extract each band as a separate raster.
# split the raster - returns a three band stack
rasters = splitrast(r,c(0.2,0.8))
# subset each band of the stack as a separate raster
r1 = subset(rasters, 1)
r2 = subset(rasters, 2)
r3 = subset(rasters, 3)
# proof - plot the separate rasters - same as those plotted in the function
plot(r1);plot(r2);plot(r3)
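As an aside, if you want all the layers at once, raster::unstack() returns them as a list in a single call (an alternative to repeated subset() calls):
r.list <- unstack(rasters) # list of RasterLayer objects, one per band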

Related

Streamlining binary rasterization in R

I have a few very small country-level polygon and point shapefiles that I would like to rasterize in R. The final product should be one global binary raster (indicating whether the grid cell center is covered by a polygon or a point lies within the cell, or not). My approach is to loop over the shapefiles and do the following for each shapefile:
library(raster)
# load shapefile
shp = sf::read_sf(shapefile_path)
# create a global raster template with resolution 0.0083
ext = extent(-180.0042, 180.0042, -65.00417, 75.00417)
gridsize = 0.008333333
r = raster(ext, res = gridsize)
# rasterize polygon or point shapefile to raster
rr = rasterize(shp, r, background = 0) # all grid cells that are not covered get 0
# convert to binary raster
values(rr)[values(rr) > 0] = 1
Here, rr is the raster file where the polygons / points in shp are coded as 1 and all other grid cells are coded as 0. Afterwards, I take the sum over all rr to arrive at one global binary raster file including all polygons / points.
The final two steps are incredibly slow. In addition, I run into RAM problems when I try to replace all positive values in rr with 1, as the cell count is very large due to the fine resolution. I was wondering whether it is possible to come up with a smarter solution for what I'd like to achieve.
I have already found the fasterize package that has a speedy implementation of rasterize which works fine. I think it would be of great help if someone has a solution where rasterize directly returns a binary raster.
This is how you can do this better with raster. Note the field=1 argument, and also that I changed your specification of the extent, as what you had is probably not correct.
library(raster)
v <- shapefile(shapefile_path)
ext <- extent(-180, 180, -65, 75)
r <- raster(ext, res = 1/120)
rr <- rasterize(v, r, field=1, background = 0)
There is no need for your last step, but you could have done
rr <- clamp(rr, 0, 1)
# or
rr <- rr > 0
# or
rr <- reclassify(rr, cbind(1, Inf, 1))
raster::calc is not very efficient for simple arithmetic like this.
It should be much faster to rasterize all vector data in one step, rather than in a loop, especially with large rasters like this (for which the program may need to write a temp file for each iteration).
To illustrate this solution with example data
library(raster)
cds1 <- rbind(c(-180,-20), c(-140,55), c(10, 0), c(-140,-60))
cds2 <- rbind(c(-10,0), c(140,60), c(160,0), c(140,-55))
cds3 <- rbind(c(-125,0), c(0,60), c(40,5), c(15,-45))
v <- spLines(cds1, cds2, cds3)
r <- raster(ncols=90, nrows=45)
r <- rasterize(v, r, field=1)
To speed things up, you can use terra (the replacement for raster)
library(terra)
f <- system.file("ex/lux.shp", package="terra")
v <- as.lines(vect(f))
r <- rast(v, ncol=75, nrow=100)
x <- rasterize(v, r, field=1)
Something that seems to work computationally and significantly improves computation time is to:
1. Create one large shapefile shp instead of working with individual rasterized shapefiles (sketched below).
2. Use the fasterize package to rasterize the merged shapefile.
3. Use raster::calc to avoid memory problems.
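A minimal sketch of the merge in step 1 (my addition; shapefile_paths is a hypothetical character vector of file paths, and the shapefiles are assumed to share the same attribute columns):
library(sf)
shp_list <- lapply(shapefile_paths, sf::read_sf)
shp <- do.call(rbind, shp_list) # one sf object holding all features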
# steps 2 and 3: rasterize the merged shapefile, then binarize
library(raster)
library(fasterize)
ext = extent(-180.0042, 180.0042, -65.00417, 75.00417)
gridsize = 0.008333333
r = raster(ext, res = gridsize)
rr = fasterize(shp, r, background = 0) # all not covered cells get 0, others get the sum
# convert to binary raster
fun = function(x) {x[x > 0] <- 1; return(x)}
r2 = raster::calc(rr, fun)

Finding minimum distance between two raster layer pixels in R

I have two thematic raster layers r1 and r2 for the same area, each following the same classification scheme with 16 classes. I need to find the minimum distance between a cell of r1 and a cell of r2 with the same value. E.g. the nth cell in r1 has value 10 and coordinates x1,y1. In r2, there are 2 cells with value 10, at coordinates x1+2,y1+2 and x1-0.5,y1-0.5. Thus the value that I need for this cell would be 0.5,0.5 (the offset to the nearer cell).
I tried distance from the raster package, but it gives the distance, for all cells that are NA, to the nearest cell that is not NA. I am confused as to how I can include the second raster layer in this.
You can use knn from the class package to find, for each cell of r1, the index of the nearest cell of r2 with the same category:
library(class)
library(raster)
# example of two rasters
r1 <- raster(ncol = 600, nrow = 300)
r2 <- raster(ncol = 600, nrow = 300)
# fill each with categories that range from 1 to 16
r1[] <- sample(1:16, ncell(r1), TRUE)
r2[] <- sample(1:16, ncell(r2), TRUE)
# coordinates of cells extracted
xy = xyFromCell(r1, 1:ncell(r1))
# multiply the raster values by a relatively large number so that cells belonging
# to the same category are far closer to each other than to cells of other categories
v1 = values(r1) * 1000000
v2 = values(r2) * 1000000
# the function returns, for each cell of r1, the index of the nearest cell of r2
out = knn(cbind(v2, xy), cbind(v1, xy), 1:ncell(r2), k = 1)
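To turn those indices into actual distances (my addition, not part of the original answer; planar coordinates are assumed):
idx <- as.integer(as.character(out)) # knn returns a factor of training-row labels
nearest_xy <- xy[idx, ]
d <- sqrt(rowSums((xy - nearest_xy)^2)) # for lon/lat data, use geosphere instead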
Alternatively, use rasterToPoints to extract a SpatialPoints object for each unique thematic class, then use the sp::spDists function to find the distances between your points.
library(raster)
library(sp)
r1 <- raster(nrow=10, ncol=10)
r2 <- raster(nrow=10, ncol=10)
set.seed(1)
r1[] <- ceiling(runif(100, 0, 10))
r2[] <- ceiling(runif(100, 0, 10))
dist.class <- NULL
for (i in unique(values(r1))) {
  p1 <- rasterToPoints(r1, fun=function(xx) xx == i, spatial=TRUE)
  p2 <- rasterToPoints(r2, fun=function(xx) xx == i, spatial=TRUE)
  dist.class[i] <- min(spDists(p1, p2))
}
cbind(class = unique(values(r1)), dist.class)
The loop may not be efficient for you. If that's a problem, wrap it in a function and lapply it, as sketched below. Also, be careful with your classes: if they aren't 1:10, my loop won't work. If your projection is in degrees, you will probably need the geosphere package to get accurate results, but the best approach in that case, I think, is to use a projection in meters.
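A minimal sketch of that wrapping (my addition; it also handles class values other than 1:10):
classes <- sort(unique(values(r1)))
dist.class <- sapply(classes, function(i) {
  p1 <- rasterToPoints(r1, fun=function(xx) xx == i, spatial=TRUE)
  p2 <- rasterToPoints(r2, fun=function(xx) xx == i, spatial=TRUE)
  min(spDists(p1, p2))
})
cbind(class = classes, dist.class)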
A memory-safe approach using the raster package would be to use the layerize() function to split up your raster values into a stack of binary rasters (16 in your case) and then use the distance() function to compute distances in the layers of r2, masking them with the respective layers of r1. Something like this:
layers1 <- layerize(r1, falseNA=TRUE)
layers2 <- layerize(r2, falseNA=TRUE)
# now you can loop over the layers (use a foreach loop if you want
# to speed things up using parallel processing)
dist.stack <- layers1
for (i in 1:nlayers(layers1)) {
  dist.i <- distance(layers2[[i]])
  dist.mask.i <- mask(dist.i, layers1[[i]])
  dist.stack[[i]] <- dist.mask.i
}
# if you want pairwise distances for all classes in one layer, simply
# combine them using sum()
dist.combine <- sum(dist.stack, na.rm=TRUE)

repeat same raster layer to create a raster stack

I am trying to create a raster stack from a RasterLayer, where the raster stack is just the same raster layer repeated a certain number of times.
I can do something like this:
library(raster)
rasterstack <- addLayer(rasterlayer, rasterlayer, rasterlayer)
and this works. However, I want the stack to have about 1000 layers. I guess I could just loop through, but I was wondering if there is a more sophisticated way of doing this.
The reason I am trying to do this is to calculate the weighted mean of a raster stack of data where each layer is a different time period, and where the weights are in a separate raster layer object. I am hoping that if I create a RasterStack from the weights raster layer with the same number of layers as the data, I'll be able to do something like:
weightedmean <- weighted.mean( data.RasterStack, weights.RasterStack )
Example data
library(raster)
r <- raster(ncol=10, nrow=10, vals=1:100)
Solution
n <- 10 # number of copies
s <- stack(lapply(1:n, function(i) r))
Or
s <- stack(replicate(n, r))
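To close the loop on the motivating use case, a sketch (my addition, assuming data.RasterStack holds the data, as in the question, and weights.RasterLayer is a hypothetical name for the single layer of weights):
weights.RasterStack <- stack(replicate(nlayers(data.RasterStack), weights.RasterLayer))
weightedmean <- weighted.mean(data.RasterStack, weights.RasterStack, na.rm=TRUE)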

Set single raster to NA where values of raster stack are NA

I have two 30m x 30m raster files from which I would like to sample points. Prior to sampling, I would like to remove the clouded areas from the images. I turned to R and Hijmans' raster package for the task.
Using the drawPoly(sp=TRUE) command, I drew 18 different polygons. The function did not seem to allow 18 polygons as one sp object, so I drew them all separately. I then gave the polygons a proj4string matching the rasters', and put them into a list. I ran the list through lapply to convert them to rasters (the rasterize function in Hijmans' package), with the polygon areas set to NA and the rest of the image set to 1.
My end goal is one raster layer with the 18 areas set to NA. I have tried stacking the list of rasterized polygons and subsetting it to set a new raster to NA in the same areas. My reproducible code is below.
library(raster)
r1 <- raster(nrow=50, ncol = 50)
r1[] <- 1
r1[4:10,] <- NA
r2 <- raster(nrow=50, ncol = 50)
r2[] <- 1
r2[9:15,] <- NA
r3 <- raster(nrow=50, ncol = 50)
r3[] <- 1
r3[24:39,] <- NA
r4 <- raster(nrow=50, ncol = 50)
r4[] <- 1
s <- stack(r1, r2, r3)
test.a.cool <- calc(s, function(x){r4[is.na(x)==1] <- NA})
For whatever reason, the darn test.a.cool is a blank plot, whereas I'm aiming for a raster in which all values are 1 except for the cells that are NA anywhere in the stack s.
Any tips?
Thanks.
Doing sum(s) will work, as sum() returns NA for any grid cell with even one NA value in the stack.
To see that it works, compare the figures produced by the following:
plot(s)
plot(sum(s))
I posted this question on the R-Sig-Geo forum, as well, and received a response from the package author. The two simplest solutions:
Use the sp package to rbind my polygons into one, then rasterize the polygon.
p <- rbind(p1, p2, p3...etc., makeUniqueIDs = TRUE)
r4 <- raster(nrow=50, ncol = 50)
r4[] <- 1
mask <- rasterize(p, r4)
mask[mask %in% 1:18] <- 1
#The above code produces a single raster file with
#my polygons as unique values, ready for masking.
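The masking step itself would then be something like this (my addition; mask() with inverse=TRUE sets the cells covered by the polygons to NA):
r4.masked <- mask(r4, mask, inverse=TRUE)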
And the second simple solution, as just pointed out by Josh O'Brien:
m <- sum(s)
test <- mask(r4, m)
The R community rocks. Problem solved (twice) within an hour. Thanks.
I'm not familiar with the package you are using. However, looking at the final line of your code, I think the issue might be here:
function(x){r4[is.na(x)==1] <- NA})
It doesn't look like calc will do much with that: the function assigns NA to the values of r4 indexed by the NAs of x, but never returns anything useful.
What then? If anything, maybe:
function(x){r4[is.na(x)==1] <- NA; return(r4) })
Although, it's not clear if that is even what you are after.
You were on the right track. The [ operator is defined for rasters and raster stacks, so you could just use the single line:
r4[ any(is.na(s) ) ] <- NA
plot(r4)
If you wanted to use calc you could have used it like this:
r4 <- calc(s, function(x){ !any(is.na(x)) })
r4[r4 == 0] <- NA # cells where the stack had any NA are 0 (FALSE); set them to NA
plot(r4)

Function for resizing matrices in R

I was wondering if there is a function in R that scales down matrices exactly like image resizing. The function imresize() in MATLAB is exactly what I'm looking for (I believe it takes the average of the surrounding points, but I am not sure of this), and I am wondering if there is an R equivalent for this function.
This question has been posted before on this forum, but with reference to MATLAB, not R:
Matlab "Scale Down" a Vector with Averages
The post starting with "Any reason why you can't use the imresize() function?" is exactly what I am looking for, but in R, not MATLAB.
Say I have a latitude-longitude grid of temperatures around the world, and let's say this is represented by a 64*128 matrix of temperatures. Now let's say I would like to have the same data contained in a new matrix, but rescaled to a 71*114 grid of temperatures around the world. A function that would allow me to do so is what I'm looking for (again, the imresize() function, but in R, not MATLAB).
Thank you.
Steve
One way to do this is by using the function resample() from the raster package.
I'll first show how you could use it to rescale your grid, and then give an easier-to-inspect example of its application to smaller raster objects.
Use resample() to resize matrices
library(raster)
m <- matrix(seq_len(64*128), nrow=64, ncol=128, byrow=TRUE)
## Convert matrix to a raster with geographical coordinates
r <- raster(m)
extent(r) <- extent(c(-180, 180, -90, 90))
## Create a raster with the desired dimensions, and resample into it
s <- raster(nrow=71, ncol=114)
s <- resample(r,s)
## Convert resampled raster back to a matrix
m2 <- as.matrix(s)
Visually confirm that resample() does what you'd like:
library(raster)
## Original data (4x4)
rr <- raster(ncol=4, nrow=4)
rr[] <- 1:16
## Resize to 5x5
ss <- raster(ncol=5, nrow=5)
ss <- resample(rr, ss)
## Resize to 3x3
tt <- raster(ncol=3, nrow=3)
tt <- resample(rr, tt)
## Plot for comparison
par(mfcol=c(2,2))
plot(rr, main="original data")
plot(ss, main="resampled to 5-by-5")
plot(tt, main="resampled to 3-by-3")
The answer posted by Josh O'Brien is OK and it helped me as a starting point, but this approach was too slow since I had a huge list of data. The method below is a good alternative: it uses the fields package and works much faster.
Functions
rescale <- function(x, newrange=range(x)){
  xrange <- range(x)
  mfac <- (newrange[2]-newrange[1])/(xrange[2]-xrange[1])
  newrange[1] + (x-xrange[1]) * mfac
}
ResizeMat <- function(mat, ndim=dim(mat)){
  if(!require(fields)) stop("`fields` required.")
  # input object
  odim <- dim(mat)
  obj <- list(x=1:odim[1], y=1:odim[2], z=mat)
  # output object
  ans <- matrix(NA, nrow=ndim[1], ncol=ndim[2])
  ndim <- dim(ans)
  # rescaling
  ncord <- as.matrix(expand.grid(seq_len(ndim[1]), seq_len(ndim[2])))
  loc <- ncord
  loc[,1] = rescale(ncord[,1], c(1, odim[1]))
  loc[,2] = rescale(ncord[,2], c(1, odim[2]))
  # interpolation
  ans[ncord] <- interp.surface(obj, loc)
  ans
}
Let's look at how it works:
## Original data (4x4)
rr <- matrix(1:16, ncol=4, nrow=4)
ss <- ResizeMat(rr, c(5,5))
tt <- ResizeMat(rr, c(3,3))
## Plot for comparison
par(mfcol=c(2,2), mar=c(1,1,2,1))
image(rr, main="original data", axes=FALSE)
image(ss, main="resampled to 5-by-5", axes=FALSE)
image(tt, main="resampled to 3-by-3", axes=FALSE)
