The terra package has an aggregate function that creates a new SpatRaster with a lower resolution (larger cells), but it requires the fact parameter. When converting many rasters, fact needs to be calculated each time. Is there a way to set fact based on the target resolution of another raster? Other functions take an existing raster as the second input, e.g. function(r1, r2).
library(terra)
r1 <- rast(ncol=10,nrow=10)
r2 <- rast(ncol=4,nrow=4)
values(r1) <- runif(ncell(r1))
values(r2) <- runif(ncell(r2))
I have tried
r3 = aggregate(r1,fact=res(r1)/res(r2))
Error: [aggregate] values in argument 'fact' should be > 0
Found the answer: I had res(r1)/res(r2) inverted; it should be
r3 = aggregate(r1,fact=res(r2)/res(r1))
Still, it would be much better to simply pass the target raster and take the resolution from it.
You cannot aggregate r1 to r2 because res(r2)/res(r1) does not return whole numbers.
res(r2)/res(r1)
#[1] 2.5 2.5
More generally, you cannot assume that you can aggregate one raster to another, so having another raster as second argument is not as obvious as with other methods such as resample.
In this special case you can do
aggregate(disagg(r1, 2), 5)
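If your rasters do align so that the resolution ratio is a whole number, a small wrapper can derive fact from a target raster. The aggregate_to helper below is hypothetical (not part of terra) and only a sketch:
library(terra)

# Hypothetical helper: aggregate x to the (coarser) resolution of target.
# Only valid when res(target)/res(x) is a whole number in both directions.
aggregate_to <- function(x, target, ...) {
  f <- res(target) / res(x)
  if (any(abs(f - round(f)) > 1e-9)) {
    stop("res(target) is not a whole-number multiple of res(x); consider resample() instead")
  }
  aggregate(x, fact = round(f), ...)
}

# With r1 and r2 from above this stops with an error (the ratio is 2.5);
# for rasters that nest cleanly it returns the aggregated SpatRaster.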
I am trying to calculate the squared difference between a raster cell i and each of its neighbors j (i.e., (j-i)^2) in a 3 x 3 neighborhood, and then calculate the mean of those squared differences and assign the result to cell i.
I found this answer, given by Forrest R. Stevens, that comes close to what I want to achieve, but I have only one raster (not a stack) with 136,710 cells (1,089,130 combinations with the adjacent function), so a for loop takes forever.
I want to use the focal function from the raster package so that the loop only runs over the 3x3 neighborhood, but it is not working for me.
Here is an example using Forrest R. Stevens' code I mentioned above:
library(raster)

r <- raster(matrix(1:25, nrow=5))
r[] <- c(2,3,2,3,2,
         3,2,3,2,NA,
         NA,3,2,3,2,
         NA,2,3,2,3,
         2,3,2,3,NA)
## Calculate adjacent raster cells for each focal cell:
a <- raster::adjacent(r, cell=1:ncell(r), directions=8, sorted=T)
# Function
sq_dff <- function(w){
  ## Create column to store calculation:
  out <- data.frame(a)
  out$sqrd_diff <- NA
  ## Loop over all focal cells and their adjacencies,
  ## extract the values across all layers and calculate
  ## the squared difference, storing it in the appropriate row of
  ## our output data.frame:
  cores <- 8
  beginCluster(cores, type='SOCK')
  for (i in 1:nrow(a)) {
    print(i)
    out$sqrd_diff[i] <- (r[a[i,2]] - r[a[i,1]])^2
    print(Sys.time())
  }
  endCluster()
  ## Take the mean of the squared differences by focal cell ID:
  r_out_vals <- aggregate(out$sqrd_diff, by=list(out$from), FUN=mean, na.rm=TRUE)
  names(r_out_vals) <- c('cell_numb','value')
  return(r_out_vals$value)
}
r1 <- focal(x=r, w=matrix(1,3,3), fun=sq_dff)
The function works well if I apply it like this:
r1 <- sq_dff(r), and using r_out <- r[[1]]; r_out[] <- r_out_vals$value; return(r_out) (as suggested by Forrest R. Stevens in his answer) instead of return(r_out_vals$value).
But when I apply it inside the focal function as written above, it returns a raster with values for only the nine cells in the center, all of them assigned the same value of 0.67.
Thanks!
You could try this:
library(terra)
r <- rast(matrix(1:25,nrow=5))
r[] <- c(2,3,2,3,2,
         3,2,3,2,NA,
         NA,3,2,3,2,
         NA,2,3,2,3,
         2,3,2,3,NA)
# for a 3x3 window, focal() passes the 9 cell values to f;
# the focal (center) cell is at position 5
f <- function(x) {
  mean((x[-5] - x[5])^2, na.rm=TRUE)
}
rr <- focal(r, 3, f)
plot(rr)
text(rr, dig=2)
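If you prefer to stay with the raster package used in the question, the same idea should carry over to raster::focal; this is a sketch under the assumption that, with a 3x3 weights matrix of ones, the function receives the 9 window values with the focal cell at position 5 (check ?focal for your version):
library(raster)

r <- raster(matrix(1:25, nrow=5))
r[] <- c(2,3,2,3,2,
         3,2,3,2,NA,
         NA,3,2,3,2,
         NA,2,3,2,3,
         2,3,2,3,NA)

# mean squared difference between the focal cell (position 5) and its neighbours
f <- function(x) mean((x[-5] - x[5])^2, na.rm=TRUE)

# pad=TRUE pads the window with NA at the edges so border cells get a value too
r_out <- focal(r, w=matrix(1, 3, 3), fun=f, pad=TRUE, padValue=NA)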
So I have been working through a population ecology exercise using the popbio package in RStudio that focuses on Leslie matrices. I have successfully created a Leslie matrix with the proper dimensions using the fecundity (mx) and annual survival (sx) values I calculated from my life table. I am now trying to use the pop.projection function in the popbio package to multiply my Leslie matrix (les.mat) by a starting population vector (N0) over a number of time intervals (4 years). It is my understanding that you should be able to multiply a Leslie matrix by a population vector to calculate the population size after a set number of time intervals. Have I done something wrong here? When I try to run my pop.projection line of code I get the following error message in R:
> projA <- pop.projection(les.mat,N0,10)
Error in A %*% n : non-conformable arguments
Could the problem be an issue with my pop.projection call? I am thinking it may be an issue with my N0 argument (the population vector): when I look at my N0 values, they seem to have been saved in R as a plain numeric vector. Should I be converting N0 into its own matrix, or into some other kind of vector, to get my pop.projection line of code to run? Any advice would be greatly appreciated; the short code I have been using is included below.
library(popbio)

Sx <- c(0.8,0.8,0.7969,0.6078,0.3226,0)
mx <- c(0,0,0.6,1.09,0.2,0)
Fx <- mx # fecundity values
S <- Sx # dropping the first value
F <- Fx
les.mat <- matrix(rep(0,36),nrow=6)
les.mat[1,] <- F
les.mat
for(i in 1:5){
  les.mat[(i+1),i] <- S[i]
}
les.mat
N0 <- c(100,80,64,51,31,10,0)
projA <- pop.projection(les.mat,N0,10)
The function uses matrix multiplication on the first and second arguments, so their dimensions must match. The les.mat matrix is 6x6, but N0 has length 7. Try
projA <- pop.projection(les.mat, N0[-7], 10) # Delete last value
or
projA <- pop.projection(les.mat, N0[-1], 10) # Delete first value
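A quick way to see the mismatch, and what pop.projection does internally, is to try the matrix product yourself with the les.mat and N0 objects from the question:
dim(les.mat)   # 6 x 6
length(N0)     # 7, so les.mat %*% N0 is non-conformable

# with a length-6 vector the product works; this is essentially one
# time step of the projection
N1 <- les.mat %*% N0[-7]
N1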
I am working with a dataset that consists of 20 layers, stacked in a RasterBrick (originating from an array). I have looked into the sum of the layers, calculated with both 'calc' and 'cellStats'. I have used calc to calculate the sum of the total values and cellStats to look at the average of the values per layer (useful for a time series).
However, when I sum the averages of the layers, the result is half the value of the sum calculated with calc. What causes this difference? What am I overlooking?
Code looks like this:
testarray <- runif(54214776,0,1)
# Although testarray should contain a raster of 127x147 with 2904 time layers.
# Not really sure how to create that yet.
for (i in 1830:1849){
  slice <- array2[,,i]
  r <- raster(nrow=(127*5), ncol=(147*5), resolution=5, ext=ext1, vals=slice)
  x <- stack(x, r)
}
brickhp2 <- brick(x)
r_sumhp2 <- calc(brickhp2, sum, na.rm=TRUE)
r_sumhp2[r_sumhp2<= 0] <- NA
SWEavgpertimestepM <- cellStats(brickhp2, stat='mean', na.rm=TRUE)
The goal is to compare the sum of the layers calculated with calc(x, sum) with the sum of the layer means calculated with cellStats(x, mean).
Rasterbrick looks like this (600kb, GTiff) : http://www.filedropper.com/brickhp2
*If there is a better way to share this, please let me know.
The confusion arises because calc operates pixel-wise on a brick (i.e. it performs the calculation on the 20 values at each pixel and returns a single raster layer), whereas cellStats performs the calculation on each raster layer individually and returns a single value per layer. You can see that the results are comparable if you use this code:
library(raster)
##set seed so you get the same runif vals
set.seed(999)
##create example rasters
ls <- list()
for (i in 1:20){
  r <- raster(nrow=(127*5), ncol=(147*5), vals=runif(127*5*147*5))
  ls[[i]] <- r
}
##create raster brick
brickhp2 <- brick(ls)
##calc sum (pixel-wise)
r_sumhp2 <- calc(brickhp2, sum, na.rm=TRUE)
r_sumhp2 ##returns raster layer
##calc mean (layer-wise)
r_meanhp2 <- cellStats(brickhp2, stat='mean', na.rm=TRUE)
r_meanhp2 ##returns vector of length nlayers(brickhp2)
##to get equivalent values you need to divide r_sumhp2 by the number of layers
##and then calculate the mean
cellStats(r_sumhp2/nlayers(brickhp2),stat="mean")
[1] 0.4999381
##and for r_meanhp2 you need to calculate the mean of the means
mean(r_meanhp2)
[1] 0.4999381
You will need to decide for yourself whether the pixel-wise or the layer-wise result is appropriate for your application.
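Another way to convince yourself, using the brickhp2 and r_sumhp2 objects created above, is to check that the grand total over all cells and layers is the same whichever dimension you collapse first:
## total of the pixel-wise sums (layers collapsed first, then cells)
cellStats(r_sumhp2, stat="sum")
## total of the layer-wise sums (cells collapsed first, then layers)
sum(cellStats(brickhp2, stat="sum"))
## the two values should agree up to floating-point error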
I want to extract the precise mean of raster values over an area defined by a polygon in R. This works using raster::extract with the option weights=TRUE. However, the operation becomes prohibitively slow with large rasters, and the function does not appear to be parallelized, so beginCluster() ... endCluster() does not speed up the process.
I need to extract the values for a range of rasters, exemplified here as r, r10 and r100. Is there a way to speed this up in r, or is there an alternative way of doing this in GDAL?
library(raster)

r <- raster(nrow=1000, ncol=1000, vals=sample(seq(0,0.8,0.01), 1000000, replace=TRUE))
r10 <- aggregate(r, fact=10)
r100 <- aggregate(r, fact=100)
v = Polygons(list(Polygon(cbind(c(-100,100,80,-120), c(-70,0,70,0)))), ID = "a")
v = SpatialPolygons(list(v))
plot(r)
plot(r10)
plot(r100)
plot(v, add=T)
system.time({
  precise.mean <- raster::extract(r100, v, method="simple", weights=TRUE, normalizeWeights=TRUE, fun=mean)
})
user system elapsed
0.251 0.000 0.253
> precise.mean
[,1]
[1,] 0.3994278
system.time({
  precise.mean <- raster::extract(r10, v, method="simple", weights=TRUE, normalizeWeights=TRUE, fun=mean)
})
user system elapsed
7.447 0.000 7.446
precise.mean
[,1]
[1,] 0.3995429
In the end I solved the problem using gdalUtils, working directly on the hard disk.
I used gdalwarp() to reduce the raster resolution to that of r10 and r100.
Then gdalwarp() again to bring the resulting raster back up to the original resolution of r.
Then gdalwarp() with cutline="v.shp", crop_to_cutline=TRUE to mask the raster to the vector v.
And then gdalinfo(), combined with grep("Mean=", ...) on its output, to extract the mean values.
All of this was wrapped in a foreach() %dopar% loop to process a number of rasters and resolutions.
While complicated, and probably not quite as precise as raster::extract, it did the job.
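Very roughly, the disk-based workflow looks like the sketch below; the file names are hypothetical, the gdalUtils argument names simply mirror the gdalwarp/gdalinfo command-line flags, and the sketch assumes gdalinfo() returns its text output as a character vector, so check the package documentation for your version.
library(gdalUtils)

## 1. warp to a coarser resolution (here: 10x the resolution of r)
gdalwarp("r.tif", "r10.tif", tr = 10 * res(r), r = "average")
## 2. warp back to the original resolution
gdalwarp("r10.tif", "r10_fine.tif", tr = res(r), r = "near")
## 3. mask to the polygon stored in v.shp
gdalwarp("r10_fine.tif", "r10_masked.tif", cutline = "v.shp", crop_to_cutline = TRUE)
## 4. pull the mean out of the statistics reported by gdalinfo
info <- gdalinfo("r10_masked.tif", stats = TRUE)
grep("Mean=", info, value = TRUE)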
It should actually run faster if you first call beginCluster (the function then deals with the parallelization). Even better would be to use version 2.7-14 which has a much faster implementation. It is currently under review at CRAN, but you can also get it here: https://github.com/rspatial/raster
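If you can switch packages, two newer options are usually much faster for polygon-weighted means; this is a sketch using the r100 and v objects from the question (behaviour of the weighting arguments can differ between versions, so check the respective help pages):
## option 1: terra, where exact=TRUE weights each cell by the fraction
## of it covered by the polygon
library(terra)
precise.mean.terra <- extract(rast(r100), vect(v), fun=mean, exact=TRUE, na.rm=TRUE)

## option 2: exactextractr, which computes coverage fractions very quickly;
## the polygon must be an sf object
library(exactextractr)
library(sf)
precise.mean.ee <- exact_extract(r100, st_as_sf(v), 'mean')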
I have a question which I assume is generic, but in my case it applies to neural networks in R.
For the record, I am using both the h2o and neuralnet packages.
As you may know, it is often advised to scale the input of a neural network so that the network works better with the activation function being used.
In R there are several ways to do this, and I use scale() / min / max.
Let's say that I have a 700x10 matrix as input, so the scaling will produce two vectors, scaled and center, of cardinality 10.
Now the problem starts when I want to unscale the output.
The formula says vOutput * vScaled (full vector) + vCenter (full vector).
Question: should I then use the full vectors (scaled and center) for the unscaling, or is there a more complex formula or set of bounds that I have missed?
#sample data
df <- data.frame(col1 = c(1:5), col2 = c(11:15), target=c(1,0,0,0,1))
#normalize sample data using scale() - except the 'target' column
df_scaled <- scale(df[,-ncol(df)])
df_scaled
#revert back to original data from scaled version
df_original <- as.data.frame(t(apply(df_scaled, 1, function(x) {
  x * attr(df_scaled, 'scaled:scale') + attr(df_scaled, 'scaled:center')
})))
df_original
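For reference, a minimal sketch of the same unscaling written with sweep(), plus a round-trip check; this assumes the df and df_scaled objects from the answer above.
# unscale column-wise with sweep(): multiply each column by its scale,
# then add back its center
df_original2 <- as.data.frame(
  sweep(sweep(df_scaled, 2, attr(df_scaled, 'scaled:scale'), '*'),
        2, attr(df_scaled, 'scaled:center'), '+')
)

# both versions should reproduce the original (unscaled) columns,
# up to floating-point error
all.equal(df_original2, df[, -ncol(df)], check.attributes = FALSE)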