Function for resizing matrices in R - r

I was wondering if there was a function that scales down matrices in R statistical software exactly like with image resizing. The function imresize() in MATLAB is exactly what I'm looking for (I believe it takes the average of the surrounding points, but I am not sure of this), but I am wondering if there is an R equivalent for this function.
This question has been posted before on this forum, but with reference to MATLAB, not R:
Matlab "Scale Down" a Vector with Averages
The post starting with "Any reason why you can't use the imresize() function?" is exactly what I am looking for, but in R, not MATLAB.
Say I have a latitude-longitude grid of temperatures around the world, and let's say this is represented by a 64*128 matrix of temperatures. Now let's say I would like to have the same data contained in a new matrix, but I would like to rescale my grid to make it a 71*114 matrix of temperatures around the world. A function that would allow me to do so is what I'm looking for (again, the imresize() function, but in R, not MATLAB)
Thank you.
Steve

One way to do this is by using the function resample(), from the raster package.
I'll first show how you could use it to rescale your grid, and then give an easier-to-inspect example of its application to smaller raster objects.
Use resample() to resize matrices
library(raster)
m <- matrix(seq_len(64*128), nrow=64, ncol=128, byrow=TRUE)
## Convert matrix to a raster with geographical coordinates
r <- raster(m)
extent(r) <- extent(c(-180, 180, -90, 90))
## Create a raster with the desired dimensions, and resample into it
s <- raster(nrow=71, ncol=114)
s <- resample(r,s)
## Convert resampled raster back to a matrix
m2 <- as.matrix(s)
Visually confirm that resample() does what you'd like:
library(raster)
## Original data (4x4)
rr <- raster(ncol=4, nrow=4)
rr[] <- 1:16
## Resize to 5x5
ss <- raster(ncol=5, nrow=5)
ss <- resample(rr, ss)
## Resize to 3x3
tt <- raster(ncol=3, nrow=3)
tt <- resample(rr, tt)
## Plot for comparison
par(mfcol=c(2,2))
plot(rr, main="original data")
plot(ss, main="resampled to 5-by-5")
plot(tt, main="resampled to 3-by-3")

The answer posted by Josh O'Brien is fine and gave me a starting point, but it was too slow for my large list of data. The method below is a good alternative: it uses the fields package and runs much faster.
Functions
rescale <- function(x, newrange=range(x)){
xrange <- range(x)
mfac <- (newrange[2]-newrange[1])/(xrange[2]-xrange[1])
newrange[1]+(x-xrange[1])*mfac
}
ResizeMat <- function(mat, ndim=dim(mat)){
if(!require(fields)) stop("`fields` required.")
# input object
odim <- dim(mat)
obj <- list(x= 1:odim[1], y=1:odim[2], z= mat)
# output object
ans <- matrix(NA, nrow=ndim[1], ncol=ndim[2])
ndim <- dim(ans)
# rescaling
ncord <- as.matrix(expand.grid(seq_len(ndim[1]), seq_len(ndim[2])))
loc <- ncord
loc[,1] = rescale(ncord[,1], c(1,odim[1]))
loc[,2] = rescale(ncord[,2], c(1,odim[2]))
# interpolation
ans[ncord] <- interp.surface(obj, loc)
ans
}
Let's look at how it works
## Original data (4x4)
rr <- matrix(1:16, ncol=4, nrow=4)
ss <- ResizeMat(rr, c(5,5))
tt <- ResizeMat(rr, c(3,3))
## Plot for comparison
par(mfcol=c(2,2), mar=c(1,1,2,1))
image(rr, main="original data", axes=FALSE)
image(ss, main="resampled to 5-by-5", axes=FALSE)
image(tt, main="resampled to 3-by-3", axes=FALSE)
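
For readers who cannot install fields, the same corner-aligned bilinear scheme can be sketched in base R. This is only an illustration under stated assumptions: resize_bilinear is a hypothetical helper (not part of any package), the input must have at least two rows and two columns, and it has not been benchmarked against ResizeMat or resample().

```r
# Hypothetical helper: bilinear resize of a numeric matrix, corners aligned
resize_bilinear <- function(mat, ndim) {
  odim <- dim(mat)
  stopifnot(all(odim >= 2))
  # Target positions expressed in source index coordinates
  rx <- seq(1, odim[1], length.out = ndim[1])
  cx <- seq(1, odim[2], length.out = ndim[2])
  # Lower neighbour index, clamped so the upper neighbour stays in range
  r0 <- pmin(floor(rx), odim[1] - 1)
  c0 <- pmin(floor(cx), odim[2] - 1)
  fr <- rx - r0  # fractional row offset in [0, 1]
  fc <- cx - c0  # fractional column offset in [0, 1]
  # Interpolate between neighbouring rows first ...
  rows <- mat[r0, , drop = FALSE] * (1 - fr) + mat[r0 + 1, , drop = FALSE] * fr
  # ... then between neighbouring columns
  rows[, c0, drop = FALSE] * rep(1 - fc, each = ndim[1]) +
    rows[, c0 + 1, drop = FALSE] * rep(fc, each = ndim[1])
}

rr <- matrix(1:16, ncol = 4, nrow = 4)
ss2 <- resize_bilinear(rr, c(5, 5))
tt2 <- resize_bilinear(rr, c(3, 3))
dim(ss2)
```

Because the corners of the old and new grids are aligned, the four corner values of rr are carried over unchanged, which is a quick sanity check for any resizing routine of this kind.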

Related

How do you temporally interpolate a large RasterStack object to a higher periodicity (weekly to daily)

In R, I am trying to interpolate between stacks that were created at a weekly time interval, to a daily time interval. The interpolation method can be nearest neighbor or linear interpolation.
I have seen this can be done for time series using na.approx or a spline.
Also, I would like to keep the object as a Stack (no dataframe) if possible.
#Dummy example
#---#
library(raster)
# Create date sequence
idx <- seq(as.Date("2000/1/1"), as.Date("2000/12/31"), by = "week")
# Create raster stack and assign dates
r <- raster(ncol=20, nrow=20)
s <- stack(lapply(1:length(idx), function(x) setValues(r,
runif(ncell(r)))))
s <- setZ(s, idx)
# Do interpolation to daily resolution
# (Perhaps it should be done one by one, perhaps all at once...)
# ...
Say my actual stack has dimensions c(20,20,52), the result would have dimensions c(20,20,366).
Thanks for your help
You need to write a function f that does this for a vector (a cell), say s[1]. Then apply this function using calc, as in calc(s, f).
Here is a simple example that uses approx, which can be replaced by spline or other interpolators:
library(raster)
# idx must be defined before it is used to build the stack
idx <- seq(as.Date("2000/1/1"), as.Date("2000/12/31"), by = "week")
r <- raster(ncol=20, nrow=20)
s <- stack(lapply(1:length(idx), function(x) setValues(r, runif(ncell(r)))))
dr <- seq(as.Date("2000/1/1"), as.Date("2000/12/31"), by = "day")
f <- function(x) approx(idx, x, dr, rule=2)$y
# test <- f(s[1])
x <- calc(s, f)
Results for one cell
plot(dr, as.vector(x[1]), pch="+")
points(idx, as.vector(s[1]), pch=20, col="red", cex=2)
lines(idx, as.vector(s[1]), col="blue")
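
To see what f does for a single cell without building a stack, here is a minimal base R sketch. The vector weekly is made-up data standing in for one cell's values, and the dates are converted with as.numeric() explicitly, which is an assumption of this sketch rather than something the answer requires.

```r
idx <- seq(as.Date("2000-01-01"), as.Date("2000-12-31"), by = "week")
dr  <- seq(as.Date("2000-01-01"), as.Date("2000-12-31"), by = "day")

set.seed(1)
weekly <- runif(length(idx))  # hypothetical values of one cell, one per week

# Linear interpolation to daily values; rule = 2 repeats the nearest
# observation for days after the last weekly date
f <- function(x) approx(as.numeric(idx), x, xout = as.numeric(dr), rule = 2)$y
daily <- f(weekly)

length(daily) == length(dr)
# The interpolated series passes through the weekly observations
all.equal(daily[match(idx, dr)], weekly)
```

calc(s, f) applies the same function to every cell, so the number of layers in the result equals length(dr).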

R: Bilinear interpolation to fill gaps in R

I have a grid that contains gaps (NAs) that I want to fill using interpolation. My grid shows autocorrelation in the x and y dimensions, so I would like to try bilinear interpolation. Most of the solutions I have found are focused on 'upsampling' (interpolation for the purpose of increasing number of samples/size of grid), but I do not want/need to change the grid size. I just want to fill NAs using interpolation. Other potential solutions do not seem to handle NAs for the input grid of values (the 'z matrix'), or are neighborhood-based solutions rather than bilinear interpolation, or simply have no answer.
I found that with the raster package, I can input a grid (as a raster) that contains NAs, and use the 'resample' command to output a grid of the same size. However, the results look like nearest neighbor interpolation rather than bilinear interpolation.
Am I missing something such that there is a way to do bilinear interpolation with the raster package? Or is there a better way to do bilinear interpolation simply to fill NAs?
library(raster)
# raster containing gap
r <- raster(nrow=10, ncol=10)
r[] <- 1:ncell(r)
r[25] <- NA
# The s raster is the same size as the r raster
s <- raster(nrow=10, ncol=10)
s <- resample(r, s, method='bilinear')
plot(r)
plot(s)
s[25]
s[35]
# s[25] appears to have been filled with neighbor s[35]
UPDATE
The Akima package seems like a promising alternative to the raster approach above, but I'm having trouble if there are NAs in the input grid of values (the Z matrix). Here's an example parallel to the example above to demonstrate. (Again, I'm interpolating to a grid the same size as the original).
library(akima)
# Use bilinear interpolation (no NAs in input)
rmat<-matrix(seq(1,100,1), nrow = 10, ncol = 10, byrow = T)
x <- seq(1,10,1)
y <- seq(1,10,1)
smat <- bilinear.grid(x, y, rmat, nx = 10, ny = 10) # works
plot(raster(rmat), main = "original")
plot(raster(smat$z), main = "interpolated")
# Try using bilinear interpolation but with an NA
rmat<-matrix(seq(1,100,1), nrow = 10, ncol = 10, byrow = T)
rmat[3,5] <- NA
x <- seq(1,10,1)
y <- seq(1,10,1)
smat <- bilinear.grid(x, y, rmat, nx = 10, ny = 10) # Error about NAs
UPDATE2
There was a great question from @Robert Hijmans about why not use a moving window average with the focal() command in the raster package. The reason is that I want to try bilinear interpolation, and I don't think a moving window average always gives the same answer as bilinear interpolation. However, this was not clear in the example I posted (in that example moving window and bilinear interp do give the same answer), so I'll demonstrate in a new example below. Note that the bilinear interpolation solution should be 8 for the example below (here is a handy calculator for tests).
library(raster)
r <- raster(nrow=10, ncol=10)
# Different grid values than earlier examples
values(r) <- c(rep(1:5, 4), rep(4:8, 4), rep(1:5, 4), rep(4:8, 4), rep(1:5, 4))
r[25] <- NA
plot(r)
# See what the mean of the moving window produces
f <- focal(r, w=matrix(1,nrow=3, ncol=3), fun=mean, NAonly=TRUE, na.rm=TRUE)
f[25] # Moving window gives 5 but bilinear interp gives 8
# Note that this seems to be how the moving window works with equal weights
window_test <- c(r[14:16], r[24:26], r[34:36])
mean(window_test, na.rm = T)
Am I missing something here? Maybe there is something clever with the weights argument of focal() that can produce a bilinear interpolation solution?
Let's use equal distance cells to avoid differences because of cell size variation with lon/lat data
library(raster)
r <- raster(nrow=10, ncol=10, crs='+proj=utm +zone=1 +datum=WGS84', xmn=0, xmx=1, ymn=0, ymx=1)
For this example, you might use focal
values(r) <- 1:ncell(r)
r[25] <- NA
f <- focal(r, w=matrix(1,nrow=3, ncol=3), fun=mean, NAonly=TRUE, na.rm=TRUE)
I see that you dismiss "neighborhood-based solutions rather than bilinear interpolation". But the question is why. In this case, you may want a neighborhood-based solution.
Update. Then again, in case of cells that are not approximately square, bilinear would be preferable.
values(r) <- c(rep(1:5, 4), rep(4:8, 4), rep(1:5, 4), rep(4:8, 4), rep(1:5, 4))
r[25] <- NA
The problem is that bilinear interpolation normally uses 4 contiguous cells, but in this case, where you want the value for the center of a cell, the appropriate value would be that of the cell itself, because the distance to that cell is zero, and thus that is where the interpolation ends up. For example, for cell 23:
extract(r, xyFromCell(r, 23))
#6
extract(r, xyFromCell(r, 23), method='bilinear')
#[1] 6
In this case the focal cell is NA, so you get the average of the focal cell and 3 more cells. The question is which three? It is arbitrary, but to make it work, the NA cell must get a value. The raster algorithm assigns the value below the NA cell to that cell (also 8 here). This works well, I think, to deal with NA values at edges (e.g. land/ocean), but perhaps not in this case.
extract(r, xyFromCell(r, 25))
#NA
extract(r, xyFromCell(r, 25), method='bilinear')
#[1] 8
That is also what resample gives
resample(r, r)[25]
# 8
Is this what the on-line calculator suggests too?
This is very sensitive to small changes
extract(r, xyFromCell(r, 25)+0.0001, method='bilinear')
#[1] 4.998997
What I would really want in this case is the mean of the rook-neighbors
mean(r[adjacent(r, 25, pairs=FALSE)])
#[1] 6
Or, more generally, the local inverse distance weighted average. You can compute
that by setting up a weights matrix with focal
# compute weights matrix
a <- sort(adjacent(r, 25, 8, pairs=F, include=TRUE))
axy <- xyFromCell(r, a)
d <- pointDistance(axy, xyFromCell(r, 25), lonlat=F)
w <- matrix(1/d, 3, 3) # inverse distance; the focal cell (d = 0) becomes Inf
w[2,2] <- 0 # the focal cell itself gets no weight
w <- w / sum(w)
# A simpler approach could be:
# w <- matrix(c(0,.25,0,.25,0,.25,0,.25,0), 3, 3)
foc <- focal(r, w, na.rm=TRUE, NAonly=TRUE)
foc[25]
In this example this is fine; but it would not be correct if there were multiple NA values in the focal area (as the sum of weights would no longer be 1). We can correct for that by computing the sum of weights
x <- as.integer(r/r)
sum_weights <- focal(x, w, na.rm=TRUE, NAonly=TRUE)
fw <- foc/sum_weights
done <- cover(r, fw)
done[25]
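
The weight-normalisation idea can also be checked on a plain matrix in base R. The sketch below is an illustration only: fill_na_weighted is a hypothetical helper, it visits NA cells one by one (so it is slow on large grids), and it treats the grid as having square cells. It divides by the sum of the weights of the neighbours that actually have data, which is the correction described above.

```r
# Hypothetical helper: fill NA cells with a weighted mean of the non-NA cells
# in the 3x3 neighbourhood, renormalising by the weights actually used
fill_na_weighted <- function(m, w) {
  stopifnot(all(dim(w) == c(3, 3)))
  out <- m
  gaps <- which(is.na(m), arr.ind = TRUE)
  for (k in seq_len(nrow(gaps))) {
    i <- gaps[k, 1]; j <- gaps[k, 2]
    num <- 0; den <- 0
    for (di in -1:1) for (dj in -1:1) {
      ii <- i + di; jj <- j + dj
      if (ii < 1 || ii > nrow(m) || jj < 1 || jj > ncol(m)) next
      v <- m[ii, jj]
      if (is.na(v)) next  # skips the focal cell and any other gaps
      num <- num + w[di + 2, dj + 2] * v
      den <- den + w[di + 2, dj + 2]
    }
    if (den > 0) out[i, j] <- num / den
  }
  out
}

# Same values as the raster example; cell 25 is row 3, column 5
vals <- c(rep(1:5, 4), rep(4:8, 4), rep(1:5, 4), rep(4:8, 4), rep(1:5, 4))
m <- matrix(vals, nrow = 10, ncol = 10, byrow = TRUE)
m[3, 5] <- NA

rook  <- matrix(c(0, 1, 0, 1, 0, 1, 0, 1, 0), 3, 3)
equal <- matrix(1, 3, 3)
fill_na_weighted(m, rook)[3, 5]   # 6, the rook-neighbour mean
fill_na_weighted(m, equal)[3, 5]  # 5, the equal-weight moving window mean
```

Passing an inverse-distance weights matrix instead of rook gives a value between these two, since the diagonal neighbours then contribute with a smaller weight.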

Transferring a conditional command from the Raster Calculator of ArcGIS to R

I would like to transfer the following code from the Raster Calculator of ArcGIS to R.
Con("diff_canopy" >= 1, "diff_canopy")
This produces a new raster which only contains the data from diff_canopy where diff_canopy is greater than or equal to 1.
To solve this, I followed and adapted the code proposed in this post:
test <- raster (extent(canopy_sjd), nrows=nrow(canopy_sjd), ncols=ncol(canopy_sjd))
test[canopy_sjd[]>=1] <- canopy_sjd[canopy_sjd[] >=1]
The code works fine. However, when I compare the raster obtained with the R code with the raster obtained directly with the ArcGIS calculator, I get different values:
From ArcGis calculator: min 1.01598 max 10.0271
From adapted R code: min 1.01598 max 11.7207
My questions are the following:
1) Does the adapted code match the Raster Calculator statement?
2) If it matches, why do the max values of the output rasters differ?
3) If it does not match, are there any other suggestions to fix the error?
Always include some example data:
library(raster)
canopy <- raster(nrow=4, ncol=4, xmn=0, xmx=1, ymn=0, ymx=1, crs='+proj=utm +zone=1')
values(canopy) <- 1:ncell(canopy)
canopy <- canopy - 5
Here is a simple solution:
x <- reclassify(canopy, cbind(-Inf, 1, NA), right=FALSE)
An alternative:
y <- mask(canopy, canopy >=1, maskvalue=0)
One more:
z <- calc(canopy, function(i){ i[i<1] <- NA; i})
For small data sets, it is possible to use your solution (but not recommended). I would do it like this:
a <- raster(canopy)
i <- which(values(canopy) >= 1)
a[i] <- canopy[i]
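
All four variants implement the same rule: keep a value where it is at least 1, otherwise set it to NA. A minimal base R sketch of that rule on a plain matrix follows; canopy_m is made-up data mirroring the example raster above and is not the asker's data.

```r
# Same values as the example raster: 1:16 filled row by row, minus 5
canopy_m <- matrix(1:16, nrow = 4, ncol = 4, byrow = TRUE) - 5

# Keep values >= 1, set everything else to NA
kept <- ifelse(canopy_m >= 1, canopy_m, NA)

range(kept, na.rm = TRUE)  # 1 11
sum(!is.na(kept))          # 11 cells survive
```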

R raster functions, splitting multiple rasters from one

I have a simple function splitting a raster object into three different classes. However my function doesn't return these rasters. I also read this tutorial http://cran.r-project.org/web/packages/raster/vignettes/functions.pdf
and according to it this is "a really bad way of doing this". However, the 'right way' seems overly complicated. Is there really no simple way of doing this? After all, functions should make things easier for you, not the other way around.
I'm quite new to processing rasters with R, so forgive me my stupid question.
rm(list=ls(all=T))
library(raster)
r <- raster(ncol=10, nrow=10)
r[] <- rnorm(100,100,5)
# Create split function // three classes
splitrast <- function(rast, quantile) {
print("Splitting raster...")
(q <- quantile(rast, probs=quantile))
r1 <- rast; r2 <- rast; r3 <- rast # copy raster three times
r1[rast > q[1]] <- NA # keep values at or below the lower quantile
r2[rast <= q[1] | rast >= q[2]] <- NA # keep values strictly between the two quantiles
r3[rast < q[2]] <- NA # keep values at or above the upper quantile
par(mfrow=c(1,3))
plot(r1);plot(r2);plot(r3)
rast <- brick(r1,r2,r3)
return(rast)
}
splitrast(r,c(0.2,0.8))
ls()
EDIT: reproducible example added
Don't try to return them separately. Instead return(list(r1,r2,r3)). But see comments about style.
The R raster subset function can help here. After you return the brick, you can subset each band as separate rasters.
# split the raster - returns a three band stack
rasters = splitrast(r,c(0.2,0.8))
# subset each band of the stack as a separate raster
r1 = subset(rasters, 1)
r2 = subset(rasters, 2)
r3 = subset(rasters, 3)
# proof - plot the separate rasters - same as those plotted in the function
plot(r1);plot(r2);plot(r3)
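
The key point in both answers is that the function's return value has to be assigned. The same pattern can be sketched on a plain matrix, returning a named list as the first answer suggests. Here split_by_quantiles is a hypothetical helper written for illustration, and the class boundaries copy those of splitrast.

```r
# Hypothetical helper: split values into three classes by two quantiles
split_by_quantiles <- function(x, probs = c(0.2, 0.8)) {
  q <- quantile(x, probs = probs, na.rm = TRUE)
  low <- x; mid <- x; high <- x
  low[x > q[1]] <- NA                # keep values at or below the lower quantile
  mid[x <= q[1] | x >= q[2]] <- NA   # keep values strictly between the quantiles
  high[x < q[2]] <- NA               # keep values at or above the upper quantile
  list(low = low, mid = mid, high = high)
}

set.seed(42)
x <- matrix(rnorm(100, 100, 5), 10, 10)

# Assign the result; calling the function without assignment discards it
parts <- split_by_quantiles(x)
names(parts)
sum(!is.na(parts$low))
```

Elements are then available as parts$low, parts$mid and parts$high, much like subset(rasters, 1) and so on for the brick.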

Raster grid position/coordinates of pixel(s) matching a value in R

Is there a way to extract the grid position or (preferably for rasters with an explicit extent) point/centroid coordinates of the pixels that match a particular value? I nearly have a pretty inefficient workflow converting to matrix and using which(mtrx == max(mtrx), arr.ind = TRUE) to get the matrix position(s), but this (a) loses geospatial information and (b) causes data to rotate 90 degrees in the matrix conversion process, both of which requiring extra code to make it work and slow the computations significantly. Is there an equivalent raster workflow anyone is aware of?
Example data:
library(raster)
set.seed(0)
r <- raster(ncols=10, nrows=10)
r[] <- sample(50, 100, replace=T)
Now do:
p <- rasterToPoints(r, function(x) x == 11)
To get
x y layer
[1,] 18 81 11
[2,] -126 63 11
[3,] -90 45 11
[4,] 54 -63 11
If you want the cell(s) with the maximum value
vmax = maxValue(r)
p <- rasterToPoints(r, function(x) x == vmax)
(do not use @data@max)
I do not understand why you would coerce to a matrix? Perhaps I do not understand your question but, if I get you correctly, you could just query the raster values and then coerce to points to get the geographic position(s).
require(raster)
r <- raster(ncols=100, nrows=100)
r[] <- runif(ncell(r), 0,1)
# Coerce < max to NA and coerce result to points
rMax <- r
m = maxValue(r)
rMax[rMax != m] <- NA
( r.pts <- rasterToPoints (rMax) )
# You could also use the raster specific Which or which.max functions.
i <- which.max(r)
xy.max <- xyFromCell(r, i)
plot(r)
points(xy.max, pch=19, col="black")
# Or for a more general application of Which
i <- Which(r >= 0.85, cells=TRUE)
xy.max <- xyFromCell(r, i)
plot(r)
points(xy.max, pch=19, col="black")
# If you prefer a raster object set cells=FALSE
i <- Which(r >= 0.85, cells=FALSE)
plot(i)
There are multiple raster functions that will allow you to pass custom or base functions to them. You may want to take a look at "focal" which is a local operator or "calc" . You may want to also read through the help related to raster.
To extend Jeffrey's answer, you can select the last instance of the lowest raster value with the following:
r <- raster(ncols=12, nrows=12)
set.seed(0)
r[] <- round(runif(ncell(r))*0.7 )
rc <- clump(r)
rc[12,8]<-1
plot(rc)
xy.min<-data.frame(xyFromCell(rc,max(which.min(rc))))
xy.min$dat<-1
coordinates(xy.min)<-~x+y
points(xy.min,lwd=2)
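
If you do stay with a plain matrix, the geographic position can be recovered from the cell number by hand. The sketch below assumes the raster conventions used above: cells are numbered row by row from the top-left corner and coordinates refer to cell centres. cell_to_xy is a hypothetical helper, and the default extent is that of raster(ncols=10, nrows=10).

```r
# Hypothetical helper: cell number -> cell-centre coordinates
cell_to_xy <- function(cell, nrows, ncols,
                       xmn = -180, xmx = 180, ymn = -90, ymx = 90) {
  row <- (cell - 1) %/% ncols + 1
  col <- (cell - 1) %% ncols + 1
  xres <- (xmx - xmn) / ncols
  yres <- (ymx - ymn) / nrows
  cbind(x = xmn + (col - 0.5) * xres,
        y = ymx - (row - 0.5) * yres)  # row 1 is the northern edge
}

# A matrix laid out like the raster (filled row by row)
set.seed(0)
m <- matrix(sample(50, 100, replace = TRUE), nrow = 10, ncol = 10, byrow = TRUE)

# t() turns R's column-major indices into row-major cell numbers
cells <- which(t(m) == max(m))
cell_to_xy(cells, nrows = 10, ncols = 10)

cell_to_xy(6, nrows = 10, ncols = 10)  # x = 18, y = 81
```

The last call reproduces the first row of the rasterToPoints output shown earlier (x = 18, y = 81), which lies in row 1, column 6 of the grid.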
