I have two rasters with data as below:
library("raster")
mdata <- raster(matrix(c(0,2,3, 11,12,13), nrow = 2, ncol = 3, byrow = TRUE))
ndata <- raster(matrix(c(0,1,2, 11,14,13), nrow = 2, ncol = 3, byrow = TRUE))
I want to stack them and find the position of the maximum at each cell, with the criterion that if both rasters have a value of 0, the output should be 0. This means that the output raster/matrix should contain only 0, 1, or 2 in this case.
I tried the following code, but it does not do quite what I want.
odata <- stack(mdata, ndata)
e <- which.max(odata)
How can I introduce a criterion that checks whether both matrices have a value of 0 at the same position, and assigns 0 if they do?
I really appreciate your feedback on this. Thanks!
How about this:
Rgames> foo <- matrix(rep(1, 6), nrow = 2, ncol = 3)
Rgames> foo
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 1 1 1
Rgames> foo[ndata - mdata > 0] <- 2
Rgames> foo[mdata == 0 & ndata == 0] <- 0
Rgames> foo
[,1] [,2] [,3]
[1,] 0 1 1
[2,] 1 2 1
I still can't tell from your question whether you want to locate the maximum of mdata or all the locations where mdata > ndata, but once you clarify that, it's easy to modify the conditionals to match.
EDIT: discuss extending to N layers of a raster.
To find which layer has the max at each location, I might use an array.
cube <- array(c(data.1, data.2, ...), dim = c(dim(data.1), N))  # for N layers
# and I apologize but I may have loaded this 3-D cube in the wrong order
maxvals <- array(dim = dim(data.1))
for (j in 1:dim(data.1)[1]) {
  for (k in 1:dim(data.1)[2]) {
    maxvals[j, k] <- which.max(cube[j, k, ])
    if (sum(cube[j, k, ]) == 0) maxvals[j, k] <- 0
  }
}
This can probably be done a lot more efficiently with aggregate or plyr tools, but I hope this points the way.
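For what it's worth, the loop above can be collapsed with apply() over the first two margins; a minimal sketch, assuming the cube array from above:

# which.max over the layer dimension for every cell
maxvals <- apply(cube, c(1, 2), which.max)
# cells whose layers sum to zero get 0 instead of a layer index
maxvals[apply(cube, c(1, 2), sum) == 0] <- 0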
I am trying to add elements to a multidimensional array whose length I don't know when it is declared, but it is not behaving as I would expect. I have seen that arrays in R are not static, and I have tried a bunch of different approaches, so I'm starting to go a little mad.
I have the array:
diffAr <- array(0,dim = c(1,2))
that I set as part of a loop:
diffAr[t,] <- c(cordX,cordY)
which works fine
> diffAr
[,1] [,2]
[1,] 2 1
On the first pass this works fine, but on the second I get this error:
Error in `[<-`(`*tmp*`, 2, , value = c(3, 5)) : subscript out of bounds
If arrays in R are dynamic, shouldn't this work and just add another row?
If you want to add rows to an array you can use rbind(), and if you want to add columns you can use cbind(). In this case, adding a second row with the values 3 and 5 can be done like this (it will also work in your loop):
diffAr <- array(0,dim = c(1,2))
diffAr[1,] <- c(2,1)
diffAr <- rbind(diffAr, c(3,5))
> diffAr
[,1] [,2]
[1,] 2 1
[2,] 3 5
This happens because you only have one row in your array.
Here is what you set up:
diffAr <- array(0,dim = c(1,2))
which looks like this:
[,1] [,2]
[1,] 0 0
Then, let's say you want to change it (in a loop or otherwise):
cordX <- 2
cordY <- 1
diffAr[1,] <- c(cordX,cordY)
Now it looks like:
[,1] [,2]
[1,] 2 1
But if you (or your loop) ask for this, which tries to put the new values in the second row:
diffAr[2,] <- c(cordX,cordY)
then you get the error, since that row doesn't exist:
Error in `[<-`(`*tmp*`, 2, , value = c(2, 1)) : subscript out of bounds
But you could keep changing the first row if you wanted:
diffAr[1,] <- c(3,5)
which works...
[,1] [,2]
[1,] 3 5
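As an aside: if you know in advance how many rows the loop will produce, preallocating the full array is usually cleaner and faster than growing it with rbind() on every iteration. A minimal sketch, where nIter and the coordinate values are hypothetical stand-ins for your loop:

nIter <- 5                             # hypothetical number of iterations
diffAr <- array(0, dim = c(nIter, 2))  # preallocate every row up front
for (t in 1:nIter) {
  cordX <- t + 1                       # placeholder coordinates
  cordY <- t - 1
  diffAr[t, ] <- c(cordX, cordY)       # fill an existing row, never grow
}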
I have a matrix where I would like to estimate 4 values:
M = [ var1  corr  0 ]
    [ corr  var2  0 ]
    [ 0     0     0 ]
Basically, the matrix has two variances and a correlation in positions M[1,2] and M[2,1], which should be equal. To this end, I wrote a function for optim as follows:
M <- matrix(c(NA,NA,0,NA,NA,0,0,0,0),3,3,byrow=TRUE)
objf <- function(pars, model, estimate = TRUE) {
  model$M[is.na(model$M)] <- pars[1:4]
  if (estimate) {
    -logLik(model)
  } else {
    model
  }
}
However, when I run my code, the two correlation values in the output are different. I was wondering if there is a better way to write my code so that M[1,2] = M[2,1]. I think the line model$M[is.na(model$M)] <- pars[1:4] is the issue, but I'm not sure how to write it better, so if anyone could help me out I'd appreciate it. Thanks!
You could change your objective function to have only 3 parameters and replace
model$M[is.na(model$M)] <- pars[1:4]
with
model$M[is.na(model$M)] <- pars[c(1,2,2,3)]
An illustration:
M <- matrix(c(NA,NA,0,
NA,NA,0,
0, 0,0), 3, 3, byrow=TRUE)
pars <- 1:3
M[is.na(M)] <- pars[c(1,2,2,3)]
M
[,1] [,2] [,3]
[1,] 1 2 0
[2,] 2 3 0
[3,] 0 0 0
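Folding this back into the objective function from the question gives a three-parameter version (a sketch; model and logLik() are as in your setup):

objf <- function(pars, model, estimate = TRUE) {
  # pars now has 3 elements: var1, the shared correlation, var2;
  # the correlation is written into both off-diagonal NA slots
  model$M[is.na(model$M)] <- pars[c(1, 2, 2, 3)]
  if (estimate) {
    -logLik(model)
  } else {
    model
  }
}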
I have a matrix Q that is relatively high-dimensional (100 x 500,000), and I want to downsample it. What I mean by downsampling is best explained with an example.
Let Q =
1 4 9
3 2 1
and a downsample size n. I want to draw n balls from a jar of sum(Q) = 20 balls, each ball colored 1 of 6 ways, corresponding to a different index pair of the matrix. It's like I have 1 ball of color A, 4 balls of color B, etc., and I'm drawing n balls without replacement.
I want the result returned in the same format, as a matrix. One example return value: downsample(Q, 3) =
0 0 2
1 0 0
My approach is to use sample:
sample(length(as.vector(Q)), size=n, replace=FALSE, prob = as.vector(Q))
However, the problem with this is that sample considers 1:length(as.vector(Q)) to be all the balls I have, so I can't draw more than length(as.vector(Q)) balls, since I'm not replacing them.
To adapt my method, I would then need to update prob by subtracting 1 from the drawn cell's count after every draw, and call sample one draw at a time using a for loop of some sort. That doesn't sound like nice code.
Is there a better way to do this in an R-friendly, no-for-loop way?
It's a little inefficient, but if sum(Q) isn't too large you can do this by disaggregating/replicating the vector and then sampling, then reaggregating/tabulating.
Q <- setNames(c(1,4,9,3,2,1),LETTERS[1:6])
n <- 10
set.seed(101)
s0 <- sample(rep(names(Q), Q), size = n, replace = FALSE)
Q2 <- table(factor(s0,levels=names(Q)))
## A B C D E F
## 1 2 5 1 0 1
I'm not sure about your matrix structure. You could use dim(Q2) <- dim(Q) to reorganize the results in the same order as your original matrix ...
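For the matrix case, a sketch of the same idea using cell indices instead of names (redefining Q as the question's matrix; same caveat about sum(Q) being manageable):

Q <- matrix(c(1, 3, 4, 2, 9, 1), nrow = 2)  # the example Q from the question
n <- 3
s0 <- sample(rep(seq_along(Q), Q), size = n, replace = FALSE)
Q2 <- tabulate(s0, nbins = length(Q))       # per-cell counts, zeros kept
dim(Q2) <- dim(Q)                           # back to the original shape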
Here's one way that's pretty good. You could improve its efficiency (if necessary) by replacing which(x <= cq)[1] with a function special-built for finding the first TRUE value; see the sketch after the code.
Q = matrix(c(1, 4, 9, 3, 2, 1), nrow = 2)
set.seed(47)
samp = sample(sum(Q), size = 3)
cq = cumsum(Q)
inds = table(sapply(samp, function(x) which(x <= cq)[1]))
result = integer(length(Q))
result[as.integer(names(inds))] = inds
dim(result) = dim(Q)
# [,1] [,2] [,3]
# [1,] 0 2 0
# [2,] 0 0 1
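One candidate for that special-built function is findInterval(), which does the search in compiled code; with left.open = TRUE (available in R >= 3.3) a sampled value x lands in the first bin with x <= cq[i]:

inds = table(findInterval(samp, cq, left.open = TRUE) + 1L)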
Does anyone know a neat/efficient way to replace the diagonal elements in an array, similar to the use of diag(x) <- value for a matrix? In other words, something like this:
> m<-array(1:27,c(3,3,3))
> for(k in 1:3){
+ diag(m[,,k])<-5
+ }
> m
, , 1
[,1] [,2] [,3]
[1,] 5 4 7
[2,] 2 5 8
[3,] 3 6 5
, , 2
[,1] [,2] [,3]
[1,] 5 13 16
[2,] 11 5 17
[3,] 12 15 5
, , 3
[,1] [,2] [,3]
[1,] 5 22 25
[2,] 20 5 26
[3,] 21 24 5
but without the use of a for loop (my arrays are pretty large and this manipulation will already be within a loop).
Many thanks.
Try this:
with(expand.grid(a = 1:3, b = 1:3), replace(m, cbind(a, a, b), 5))
EDIT:
The question asked for neat/efficient but, of course, those are not the same thing. The one-liner here is compact and loop-free, but if you are looking for speed, I think you will find that the loop in the question is actually the fastest of all the answers.
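If the dimensions vary, the same expand.grid()/replace() idea can be wrapped up (a sketch, assuming the first two dimensions are equal; diag3d<- is a made-up name):

`diag3d<-` <- function(m, value) {
  idx <- expand.grid(a = seq_len(dim(m)[1]), b = seq_len(dim(m)[3]))
  replace(m, cbind(idx$a, idx$a, idx$b), value)  # rows of (i, i, k) index the diagonals
}
m <- array(1:27, c(3, 3, 3))
diag3d(m) <- 5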
You can use the following function for that, provided you have only 3 dimensions in your array. You can generalize to more dimensions based on this code, but I'm too lazy to do that for you ;-)
`arraydiag<-` <- function(x, value) {
  dims <- dim(x)
  id <- seq_len(dims[1]) + dims[2] * (seq_len(dims[2]) - 1)
  id <- outer(id, (seq_len(dims[3]) - 1) * prod(dims[1:2]), `+`)
  x[id] <- value
  dim(x) <- dims
  x
}
This works like:
m <- array(1:36, c(3, 3, 4))
arraydiag(m) <- NA
m
Note that, contrary to the diag() function, this function cannot deal with matrices that are not square. You can look at the source code of diag() to find out how to adapt this code so that it does.
diagArr <- function(dim) {
  n <- dim[2]
  if (dim[1] != n) stop("expecting first two dimensions to be equal")
  d <- seq(1, n * n, by = n + 1)
  as.vector(outer(d, seq(0, by = n * n, length.out = prod(dim[-1:-2])), "+"))
}
m[diagArr(dim(m))] <- 5
This is written with the intention that it works for dimensions higher than 3, but I haven't tested it in that case. It should be okay, though.
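For example, usage on a 4-dimensional array would look like this (a quick sketch in the same untested spirit):

m4 <- array(0, c(3, 3, 2, 2))
m4[diagArr(dim(m4))] <- 5  # sets the diagonal of all four 3x3 slices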
I've managed as far as the code below in writing a function to sample from a contingency table, proportional to the frequencies in the cells.
It uses expand.grid and then table to get back to a table of the original size. This works fine as long as the sample size is large enough that no category is completely missing; otherwise, the table command returns a table of smaller dimensions than the original one.
FunSample <- function(Full, n) {
  Frame <- expand.grid(lapply(dim(Full), seq))
  table(Frame[sample(1:nrow(Frame), n, prob = Full, replace = TRUE), ])
}
Full<-array(c(1,2,3,4), dim=c(2,2,2))
FunSample(Full, 100) # OK
FunSample(Full, 1) # not OK, I want it to still have dim=c(2,2,2)!
My brain has stopped working; I know it has to be a small tweak to get it back on track!
A crosstab is also a multinomial distribution, so you can use rmultinom and reset the dimensions of the output. This should give a substantial performance boost and cut down on the code you need to maintain.
> X <- rmultinom(1, 500, Full)
> dim(X) <- dim(Full)
> X
, , 1
[,1] [,2]
[1,] 18 92
[2,] 45 92
, , 2
[,1] [,2]
[1,] 28 72
[2,] 49 104
> X2 <-rmultinom(1, 4, Full)
> dim(X2) <- dim(Full)
> X2
, , 1
[,1] [,2]
[1,] 0 1
[2,] 0 0
, , 2
[,1] [,2]
[1,] 0 1
[2,] 1 1
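Wrapped in the question's FunSample() interface, the whole thing becomes (a sketch):

FunSample <- function(Full, n) {
  X <- rmultinom(1, n, Full)  # one multinomial draw of size n, probabilities proportional to Full
  dim(X) <- dim(Full)         # restore the original array shape
  X
}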
If you don't want table() to "drop" missing combinations, you need to force the columns of Frame to be factors:
FunSample <- function(Full, n) {
  Frame <- as.data.frame(lapply(expand.grid(lapply(dim(Full), seq)), factor))
  table(Frame[sample(1:nrow(Frame), n, prob = Full, replace = TRUE), ])
}
> dim( FunSample(Full, 1))
[1] 2 2 2
> dim( FunSample(Full, 100))
[1] 2 2 2
You could use tabulate instead of table; it works on integer-valued vectors, as you have here. You could also get the output into an array by using array directly, just like when you created the original data.
FunSample <- function(Full, n) {
  samp <- sample(1:length(Full), n, prob = Full, replace = TRUE)
  # nbins keeps the count vector at full length even when the
  # highest cell index is never sampled
  array(tabulate(samp, nbins = length(Full)), dim = dim(Full))
}
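A quick check against the failing case from the question:

Full <- array(c(1, 2, 3, 4), dim = c(2, 2, 2))
dim(FunSample(Full, 1))  # [1] 2 2 2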