I'm trying to run an overlay function with rasters where I want to meet all of 3 different conditions at each cell (using == and > or < operators) and produce a single raster as output.
Running ifelse with the & operator seems to evaluate the conditions from left to right: if the first two conditions are met, it returns the "yes" value regardless of the third condition. && can't be used here because it is not vectorized.
The example below shows this: the resulting raster makes it look as though not all three conditions are evaluated. The first clue is that it produces non-NA results even where the third raster is NA.
I think I can get the result I want by first checking the condition of x and y and then with this result separately checking the condition of z with a different function, but I was hoping to be able to do it all in one function (seems like this should be possible, at least).
Hoping someone can point me in the right direction.
library(raster)
fn <- system.file("external/test.grd", package="raster")
s <- stack(fn, fn,fn)
#Create grids
s[[1]] <- round(runif(ncell(s), 1, 2))
s[[2]] <- round(runif(ncell(s), 1, 2))
s[[3]] <- round(runif(ncell(s), 1, 2))
#convert some values in s[[3]] to NA
s[[3]][s[[3]] == 1]<- NA
#run overlay function
result.rast <- overlay(s[[1]], s[[2]], s[[3]], fun =
function(x,y,z) {
ifelse( x == 2 & y == 1 & z ==2, 1, 0)
} )
I do not see evidence for the third condition not being used. NA values are a special case. See function f2 for some things you can do.
It is easier to see what's going on with a smaller raster
library(raster)
set.seed(0)
r <- raster(ncol=10, nrow=10, xmn=0, xmx=10, ymn=0, ymx=10)
r1 <- setValues(r, round(runif(ncell(r), 1, 2)))
r2 <- setValues(r, round(runif(ncell(r), 1, 2)))
r3 <- setValues(r, round(runif(ncell(r), 1, 2)))
r3[r3 == 1] <- NA
s <- stack(r1, r2, r3)
res1 <- overlay(s, fun =
function(x,y,z) {
ifelse( x == 2 & y == 1 & z ==2, 1, 0)
} )
#A more complex function, that keeps NAs
f2 <- function(x,y,z) {
a <- rep(0, length(x))
a[x == 2 & y == 1 & z ==2] <- 1
a[is.na(x) | is.na(y) | is.na(z)] <- NA
a
}
res2 <- overlay(s, fun = f2)
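The NA behaviour follows from how & treats missing values, and a sketch on plain vectors may make that easier to see. The vectors x, y and z below are made-up stand-ins for cell values, not taken from the rasters above. FALSE & NA is FALSE, so a cell can come out as 0 even where z is NA, while TRUE & NA stays NA.

```r
# Made-up cell values standing in for the three layers
x <- c(2, 2, 1, 2)
y <- c(1, 1, 1, 2)
z <- c(2, NA, NA, NA)

# Three-valued logic: a FALSE anywhere wins over NA, a TRUE does not
test <- x == 2 & y == 1 & z == 2
test
# TRUE NA FALSE FALSE

# ifelse keeps NA where the test is NA
res <- ifelse(test, 1, 0)
res
# 1 NA 0 0
```

So the third condition is evaluated; it only looks ignored in cells where an earlier condition is already FALSE.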
Did you try stackApply?
You can also use each layer of the raster stack as vectors.
Here is an example (there might be a better way to reference the cells in the rasterStack, though)
tt <- raster(ncol=4,nrow=5)
tt[] <- 1
tts <- stack(tt,tt,tt)
tts[[1]][4,2]<-NA
# now the condition
tt2 <- (tts[[1]] == 1 & tts[[2]] == 1 & tts[[3]] == 1)
plot(tt2)
For each cell of a rasterStack, I want to find the most recent layer where the value exceeds a fixed threshold. Layers are stacked in chronological order, so this corresponds to the maximum index. Ultimately I want to know 1) the year of that layer (taken from the layer name), and 2) the value in that year.
I wrote a function to do this and was getting the wrong result. I modified the function in a way that I did not think would change it; I now get the correct result. My question is why these functions produce something different.
Set up the example:
library(raster)
library(rasterVis)
### Example raster stack:
set.seed(123123)
r1 <- raster(nrows = 10, ncols = 10)
r2 <- r3 <- r4 <- r1
r1[] <- rbinom(ncell(r1), 1, prob = .1)
r2[] <- rbinom(ncell(r1), 1, prob = .1)
r3[] <- rbinom(ncell(r1), 1, prob = .1)
r4[] <- rbinom(ncell(r1), 1, prob = .1)
rs <- stack(r1, r2, r3, r4)
names(rs) <- paste0("yr", 1:4)
These are the two functions I expected to be equivalent. Is there a reason not to use a vector as an intermediate step?
### Function to find index of last event:
# v1; this is wrong
findLast <- function(x) {
fire.ind <- ifelse(any(x > 0), which(x > 0), NA)
max.ind <- max(fire.ind)
}
# v2; this one gives correct answer
findLast2 <- function(x) {
max.ind <- ifelse(any(x > 0), max(which(x > 0)), NA)
}
testFind <- calc(rs, findLast)
freq(testFind)
testFind2 <- calc(rs, findLast2)
all.equal(testFind, testFind2)
Show the example inputs and the different results:
# plot:
s2 = stack(rs, testFind, testFind2)
levelplot(s2, pretty = TRUE)
Code to get the final layers I want:
### Most recent year:
nameFromInd <- function(x) {
yr <- as.integer(gsub("\\D", "", names(rs)[x])) # drop non-digits so multi-digit years (e.g. "yr12") are kept whole
}
testYr <- calc(testFind2, nameFromInd)
### Value in most recent year:
testYrValue <- stackSelect(rs, testFind2)
Any insight into what I'm not seeing here? I haven't played around with alternatives that enhance speed, but any suggestions are welcome as I'll be doing this on a very large dataset.
sessionInfo()
R version 3.2.4 Revised (2016-03-16 r70336)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
ifelse can be a bit surprising: it returns a value with the same "shape" as its first argument (the test), which here has length one. So it returns only the first element of which(x > 0). Always check the functions you write on a plain vector before using them with calc.
x <- c(-1:2, 1:-1)
ifelse(any(x > 0), which(x > 0), NA)
#[1] 3
ifelse(any(x > 0), max(which(x > 0)), NA)
#[1] 5
ifelse is a pretty involved function, and I think you can avoid it by doing:
r <- calc(rs, function(x) max(which(x > 0)))
y <- stackSelect(rs, r)
(and ignore the warnings). Or suppress them:
options(warn=-1)
r <- calc(rs, function(x) max(which(x > 0)))
options(warn=0)
testYrValue <- stackSelect(rs, r)
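If switching warnings off globally feels too broad, one alternative is to avoid the empty case altogether. This is a minimal sketch on plain vectors; last_positive is a hypothetical helper name, and it assumes the function receives the values of one cell across all layers as a vector.

```r
# Hypothetical helper: index of the last element > 0, or NA if there is none.
# which() returns indices in increasing order, so the last one is the maximum.
last_positive <- function(x) {
  idx <- which(x > 0)
  if (length(idx) > 0) idx[length(idx)] else NA_integer_
}

last_positive(c(0, 1, 0, 1))  # 4
last_positive(c(0, 0, 0, 0))  # NA
```

Because max() is never called on an empty vector, no warning is raised and no -Inf needs cleaning up afterwards.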
You can also combine what you do with calc and stackSelect like this
f1 <- function(x) {
if (any(x>0)) {
i <- max(which(x > 0))
cbind(i, x[i])
} else {
cbind(NA, NA)
}
}
rr1 <- calc(rs, f1)
Or the shortcut variant:
f2 <- function(x) {
i <- max(which(x > 0))
cbind(i, x[i])
}
rr2 <- calc(rs, f2)
Or
f3 <- function(x) {
i <- max(which(x > 0))
z <- cbind(i, x[i])
z[!is.finite(z)] <- NA
z
}
rr3 <- calc(rs, f3)
I'm trying to produce a binary raster layer based on a single conditional function applied over many layers. Here's an example I found of what I want to accomplish, however, this example is only for 3 layers. If I want to check for conditions over 25 layers, is there a way to do this that doesn't require 25 arguments input into the function?
library(raster)
set.seed(0)
r <- raster(ncol=10, nrow=10, xmn=0, xmx=10, ymn=0, ymx=10)
r1 <- setValues(r, round(runif(ncell(r), 1, 2)))
r2 <- setValues(r, round(runif(ncell(r), 1, 2)))
r3 <- setValues(r, round(runif(ncell(r), 1, 2)))
s <- stack(r1, r2, r3)
res1 <- overlay(s, fun =
function(x,y,z) {
ifelse( x == 2 | y == 2 | z ==2, 1, 0)
})
You can do:
r <- any(s == 2)
or
z <- calc(s, function(x) any(x==2))
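To see why this scales to any number of layers, here is a sketch on a plain matrix, assuming rows stand in for cells and columns for layers (the matrix m is made up for illustration). The test runs across each row, so no per-layer argument is needed.

```r
# Made-up values: 4 cells (rows) x 5 layers (columns)
m <- matrix(c(1, 1, 1, 1, 1,
              1, 2, 1, 1, 1,
              2, 2, 2, 2, 2,
              1, 1, 1, 1, 2),
            nrow = 4, byrow = TRUE)

# TRUE where at least one layer equals 2
hit <- apply(m, 1, function(v) any(v == 2))
hit
# FALSE TRUE TRUE TRUE

# Same result without apply
hit2 <- rowSums(m == 2) > 0
```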
I am experimenting with R and can't find a way to do the following:
1- For each row where x == 3, multiply x by the "y" value of the same row.
2- Add up all the products from step 1.
x <- 3426278722533992028364647392927338
y <- 7479550949037487987438746984798374
x <- as.numeric(strsplit(as.character(x), "")[[1]])
y <- as.numeric(strsplit(as.character(y), "")[[1]])
Table <- table(x,y)
Table <- data.frame(Table)
Table$Freq <- NULL
So I tried creating a function:
Calculation <- function (x,y) {
z <- if(x == 3){ x * y }
w <- sum(z)
}
x and y are the columns of the data.frame
This prints an error which I'm struggling to solve...
Thanks for your time,
Kylian Pattje
2 things here:
1. Use ifelse in your function, since if() expects a single TRUE or FALSE and is not vectorized,
Calculation <- function (x,y) {
z <- ifelse(x == 3, x * y, NA)
w <- sum(z, na.rm = TRUE)
return(w)
}
2. Make sure your variables are NOT factors,
Table[] <- lapply(Table, function(i) as.numeric(as.character(i)))
Calculation(Table$x, Table$y)
#[1] 84
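The same sum can also be written with logical indexing instead of ifelse. This is a sketch on made-up numeric vectors standing in for Table$x and Table$y, not the data above.

```r
# Made-up columns standing in for Table$x and Table$y
x <- c(3, 4, 3, 2)
y <- c(7, 4, 9, 5)

# Keep only the rows where x is 3, multiply pairwise, then add up
keep <- x == 3
total <- sum(x[keep] * y[keep])
total
# 48
```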
I have two raster layers that I wish to combine into one. Let's call them mask (with values 1 and NA), and vrs.
library(raster)
mask <- raster(ncol=10, nrow=10)
mask[] <- c(rep(0, 50), rep(1, 50))
mask[mask < 0.5] <- NA
vrs <-raster(ncol=10, nrow=10)
vrs[] <- rpois(100, 2)
vrs[vrs >= 4] <- NA
I wish to combine two big layers, but for the sake of understanding these small examples are ok. What I wish to do is to set the pixel values of my output layer to zero for those pixels where mask layer is 1 and vrs layer is NA. All other pixels should remain with the values of original vrs.
This is my only thought as to how:
zero.for.NA <- function(x, y, filename){
out <- raster(y)
if(canProcessInMemory(out, n = 4)) { #wild guess..
val <- getValues(y) #values
NA.pos <- which(is.na(val)) #positions of all NA values in the values layer
NA.t.noll.pos<-which(x[NA.pos]==1) #Positions where mask is 1 within the
#vector of positions of NA values in vrs
val[NA.pos[NA.t.noll.pos]] <- 0 #set values layer to 0 where condition met
out <- setValues(out, val)
return(out)
} else { #for large rasters the same thing by chunks
bs <- blockSize(out)
out <- writeStart(out, filename, overwrite=TRUE)
for (i in 1:bs$n) {
v <- getValues(y, row=bs$row[i], nrows=bs$nrows[i])
xv <- getValues(x, row=bs$row[i], nrows=bs$nrows[i])
NA.pos <- which(is.na(v))
NA.t.noll.pos <- which(xv[NA.pos]==1)
v[NA.pos[NA.t.noll.pos]] <- 0
out <- writeValues(out, v, bs$row[i])
}
out <- writeStop(out)
return(out)
}
}
This function worked on the small example and seems to work on the bigger ones. Is there a faster or better way of doing this, ideally one that suits larger files? I will have to use this on many sets of layers, and I would appreciate any help in making the process safer and/or quicker!
I'd use cover():
r <- cover(vrs, mask-1)
plot(r)
You can do this with overlay, as well:
r <- overlay(mask, vrs, fun=function(x, y) ifelse(x==1 & is.na(y), 0, y))
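How the overlay function treats the four possible combinations of mask and vrs can be checked on plain vectors. This is a sketch with made-up values, where m stands in for mask and v for vrs.

```r
m <- c(1, 1, NA, NA)   # mask: 1 or NA
v <- c(NA, 2, NA, 3)   # vrs: a value or NA

# 0 where mask is 1 and vrs is NA; otherwise the original vrs value
out <- ifelse(m == 1 & is.na(v), 0, v)
out
# 0 2 NA 3
```

Where both layers are NA the test itself is NA, so the output stays NA, which matches the requirement that all other pixels keep the original vrs value.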
I have a vector like this:
x<-c(-0.193,-0.126,-0.275,-0.375,-0.307,-0.347,-0.159,-0.268,-0.013,0.070,0.346,
0.376,0.471,0.512,0.291,0.554,0.185,0.209,0.057,0.058,-0.157,-0.291,-0.509,
-0.534,-0.239,-0.389,0.060,0.250,0.279,0.116,0.052,0.201,0.407,0.360,0.065,
-0.167,-0.572,-0.984,-1.044,-1.039,-0.831,-0.584,-0.425,-0.362,-0.154,0.207,
0.550,0.677,0.687,0.856,0.683,0.375,0.298,0.581,0.546,0.098,-0.081)
I would like to find the position of the lowest number each time 5 or more consecutive values are below -0.5 (x < -0.5). In the example that is the value -1.044.
How do I find this?
What I have done is this:
xx<-ifelse(x>.5,1,NA)
xx
aa<-rle(xx)
zz <- rep(FALSE, length(xx))
zz[sequence(aa$lengths) == 1] <- aa$lengths >= 5 & aa$values == 1
zz
But then I just find the position of the first value and not the extreme.
Any help?
Thanks for posting what you've tried.
I'd just use a logical comparison for xx:
xx <- x < -0.5
Then your rle logic becomes:
aa <- rle(xx)
zz <- aa$lengths >= 5 & aa$values
From there, identify which values of zz are true and use cumsum to get the indices into x (this is oversimplified since there is only one instance here, but you get the picture):
first <- which(zz)
idxs <- cumsum(aa$lengths[1:first])
min(x[(idxs[first - 1] + 1):idxs[first]]) # +1 skips the last element of the preceding run
In the instance where you have multiple matches, first will be a vector with length > 1. In that case, make a function and you can apply it to your vector:
myfun <- function(y) {
idxs <- c(0, cumsum(aa$lengths[1:y]))
min(x[(idxs[y] + 1):idxs[y + 1]]) # run y spans idxs[y] + 1 to idxs[y + 1]
}
set.seed(20)
x <- rnorm(100)
xx <- x < -0.5
aa <- rle(xx)
zz <- aa$lengths >= 3 & aa$values
first <- which(zz)
sapply(first, myfun)
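The code above returns the minimum value of each run, while the question asks for its position. A minimal sketch that returns positions instead, using the vector from the question; run_min_pos is a hypothetical helper name, not part of the answer above.

```r
x <- c(-0.193, -0.126, -0.275, -0.375, -0.307, -0.347, -0.159, -0.268, -0.013,
       0.070, 0.346, 0.376, 0.471, 0.512, 0.291, 0.554, 0.185, 0.209, 0.057,
       0.058, -0.157, -0.291, -0.509, -0.534, -0.239, -0.389, 0.060, 0.250,
       0.279, 0.116, 0.052, 0.201, 0.407, 0.360, 0.065, -0.167, -0.572, -0.984,
       -1.044, -1.039, -0.831, -0.584, -0.425, -0.362, -0.154, 0.207, 0.550,
       0.677, 0.687, 0.856, 0.683, 0.375, 0.298, 0.581, 0.546, 0.098, -0.081)

# Hypothetical helper: position of the minimum within each run of at least
# n consecutive values below threshold
run_min_pos <- function(x, threshold, n) {
  r <- rle(x < threshold)
  ends <- cumsum(r$lengths)          # last index of every run
  starts <- ends - r$lengths + 1L    # first index of every run
  keep <- which(r$values & r$lengths >= n)
  vapply(keep,
         function(k) starts[k] + which.min(x[starts[k]:ends[k]]) - 1L,
         integer(1))
}

pos <- run_min_pos(x, -0.5, 5)
pos     # 39
x[pos]  # -1.044
```

If no run qualifies, the function returns an empty integer vector rather than failing.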
A function with the apply function inside:
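find.val <- function(x, threshold, n, all = TRUE) {
# runs of consecutive values below the threshold
tmp <- rle(x < threshold)
cs <- cumsum(tmp$lengths)
# start (indices) and end (l) position of every run
dfcs <- data.frame(indices = c(0, cs[-length(cs)]) + 1, l = cs)
# position of the minimum within each run, kept only for qualifying runs
pos <- (apply(dfcs, 1, function(y) which.min(x[y[1]:y[2]]) + y[1] - 1))[tmp$values & tmp$lengths >= n]
if (all) return(pos)
pos[which.min(x[pos])]
}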
If you set all=TRUE you get the positions of all matches; otherwise you get only the position of the lowest match.
Example:
find.val(x,-0.5,5,all=T)