Conditional replacement of values in an array - r

I want to modify an array using an element-by-element condition. This is what I want to do:
vector <- runif(18, 0,1)
xx <- array(vector, dim=c(2,3,3))
for (i in 1:2) {
  for (j in 1:3) {
    xx[i,j,1] <- ifelse(xx[i,j,1]<0.5,1,xx[i,j,1])
    xx[i,j,2] <- ifelse(xx[i,j,2]<0.4,1.5,xx[i,j,2])
    xx[i,j,3] <- ifelse(xx[i,j,3]<0.2,2,xx[i,j,3])
  }
}
Is there a more efficient way to do it?
Thanks

Not sure what you mean by efficient but this avoids looping:
vector <- runif(18, 0,1)
xx <- array(vector, dim=c(2,3,3))
xx
xx[,,1][xx[,,1]<.5] <- 1
xx[,,2][xx[,,2]<.4] <- 1.5
xx[,,3][xx[,,3]<.2] <- 2

There are two ways you could simplify this double loop.
Option 1:
vector <- runif(18, 0,1)
xx <- array(vector, dim=c(2,3,3))
xx[,,1][xx[,,1]<.5] = 1
xx[,,2][xx[,,2]<.4] = 1.5
xx[,,3][xx[,,3]<.2] = 2
You still have to write one line for each condition, though.
The second way is to use lapply, but in this case you have to create three vectors: index, threshold, substitution
idx = 1:3
thr = c(.5, .4, .2)
sb = c(1, 1.5, 2)
# <<- modifies xx in the enclosing environment; invisible() hides lapply's return value
invisible(lapply(idx, function(k){
  xx[,,k][ xx[,,k] < thr[k] ] <<- sb[k]
}))
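If the thresholds and substitutions are already stored in vectors such as thr and sb, the per-slice work can probably also be collapsed into a single masked assignment. The following is only a minimal sketch, assuming base R's slice.index() to recover each element's position along the third dimension; the names k, mask and ref are illustrative and not part of the original answers.

```r
set.seed(1)  # fixed seed so the sketch is reproducible
xx  <- array(runif(18, 0, 1), dim = c(2, 3, 3))
thr <- c(.5, .4, .2)   # one threshold per slice
sb  <- c(1, 1.5, 2)    # one substitution value per slice

# reference result built with the explicit per-slice assignments
ref <- xx
for (s in 1:3) ref[, , s][ref[, , s] < thr[s]] <- sb[s]

# k has the same shape as xx and holds each element's third-dimension index
k <- slice.index(xx, 3)
# TRUE where an element is below the threshold of its own slice
mask <- xx < thr[k]
# replace only the flagged elements with the matching substitution value
xx[mask] <- sb[k][mask]
```

The design choice here is to trade the three explicit lines for an index array, which may be worthwhile once the number of slices grows.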

Related

I am trying to create a script that runs consecutive prop.test( ) for consecutive values using a for Loop

Script:
a <- c(10, 20)
b <- c(100, 200)
c <- c(50 , 1000)
d <- c(3000, 4300)
for (i in c(a,b,c,d))
{
  print(prop.test(a,b))
}
So essentially I want every 2 objects to be paired up. I hope I am somewhat clear.
You can put the vectors in a list and use a for loop as follows -
list_data <- list(a, b, c, d)
result <- vector('list', length(list_data)/2)
for(i in seq_along(result)) {
  n <- (i -1) * 2 + 1
  result[[i]] <- prop.test(list_data[[n]], list_data[[n+1]])
  print(result[[i]])
}
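The same pairing can be expressed without index arithmetic. This is a rough sketch, assuming (as in the loop above) that a and c hold the success counts and b and d the matching trial counts; x_list, n_list and p_values are hypothetical names introduced here.

```r
a <- c(10, 20)
b <- c(100, 200)
c <- c(50, 1000)    # shadows the name only; the function c() is still found when called
d <- c(3000, 4300)

x_list <- list(a, c)   # success counts, one vector per test
n_list <- list(b, d)   # trial counts, paired with x_list by position

# Map calls prop.test(x_list[[i]], n_list[[i]]) for each i
results <- Map(prop.test, x_list, n_list)

# pull one number out of each test result
p_values <- vapply(results, function(r) r$p.value, numeric(1))
```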

For in function in R for different groups of rows

I have the following R objects:
y <- sample(c(0,2,2),1000,replace=T)
X <- matrix(runif(2000,0,2),ncol=2,byrow=T)*2
XX = t(X)%*%X
XY = as.numeric(t(X)%*%y)
YY = as.numeric(t(y)%*%y)
How can I run it to get several XX, XY and YY objects calculated with the first 10 rows, others with the rows 11 to 20, etc...?? Any ideas with a for in loop?
Thank you!
We can create lists to store different outputs. Create a sequence with a step of 10 and calculate the result in for loop.
len <- length(y)/10
XX_list <- vector('list', len)
XY_list <- vector('list', len)
YY_list <- vector('list', len)
vals <- seq(1, length(y), 10)
for(i in seq_along(vals)) {
  inds <- vals[i]:(vals[i] + 9)
  XX_list[[i]] <- t(X[inds, ]) %*% X[inds, ]
  XY_list[[i]] <- as.numeric(t(X[inds, ]) %*% y[inds])
  YY_list[[i]] <- as.numeric(t(y[inds]) %*% y[inds])
}
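An alternative to tracking start positions is to split the row indices into blocks first and then apply the computation to each block. This is a sketch under the assumption that the number of rows is a multiple of 10, as in the question; grp and row_groups are illustrative names, and crossprod(A, B) is base R's shorthand for t(A) %*% B.

```r
set.seed(42)  # fixed seed so the sketch is reproducible
y <- sample(c(0, 2, 2), 1000, replace = TRUE)
X <- matrix(runif(2000, 0, 2), ncol = 2, byrow = TRUE) * 2

# block label for every row: 1 for rows 1-10, 2 for rows 11-20, ...
grp <- ceiling(seq_along(y) / 10)
row_groups <- split(seq_along(y), grp)

# drop = FALSE keeps each block a matrix
XX_list <- lapply(row_groups, function(i) crossprod(X[i, , drop = FALSE]))
XY_list <- lapply(row_groups, function(i) drop(crossprod(X[i, , drop = FALSE], y[i])))
YY_list <- lapply(row_groups, function(i) sum(y[i]^2))
```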

Vectorize double loops in R

I am new to R and am trying to vectorize my codes below.
What is a better way to do this? Thanks so much!
l_mat <- data.frame(matrix(ncol = 4, nrow = 4))
datax <- data.frame("var1"= c(1,1,1,1), "Var2" = c(2,2,2,2), "Var3"=c(3,3,3,3), "Var4"=c(4,4,4,4))
for (i in 1:4) {
  for (j in 1:4) {
    if (datax[i, 2] == datax[j, 2]) {
      l_mat[i, j] <- 100
    } else {
      l_mat[i, j] <- 1
    }
  }
}
This is better done with outer. Since we are comparing every value in the second column against every value in that same column, build the logical matrix with outer, convert it to a numeric index, and then replace the values with 1 or 100
out <- 1 + (outer(datax[,2], datax[,2], `==`))
out[] <- c(1, 100)[out]
Or in a single line
ifelse(outer(datax[,2], datax[,2], `==`), 100, 1)
Or use a variation with pmax and outer
do.call(pmax, list(outer(datax[,2], datax[,2], `==`) * 100, 1))
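Because every value in the question's second column is identical, the result there is 100 everywhere, which hides the pattern. The sketch below uses a made-up vector v (not from the question) and checks the outer version against the original double loop.

```r
v <- c(2, 5, 2, 7)   # hypothetical column with one repeated value

# TRUE wherever v[i] equals v[j]
same <- outer(v, v, `==`)
l_mat <- ifelse(same, 100, 1)

# the same matrix built with the explicit double loop, for comparison
loop_mat <- matrix(NA_real_, nrow = 4, ncol = 4)
for (i in 1:4) {
  for (j in 1:4) {
    loop_mat[i, j] <- if (v[i] == v[j]) 100 else 1
  }
}
```

Here the diagonal is 100, as are the two cells pairing the first and third elements; every other cell is 1.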

how to append an element to a list without keeping track of the index?

I am looking for the r equivalent of this simple code in python
mylist = []
for this in that:
    df = 1
    mylist.append(df)
basically just creating an empty list, and then adding the objects created within the loop to it.
I only saw R solutions where one has to specify the index of the new element (say mylist[[i]] <- df), thus requiring to create an index i in the loop.
Is there any simpler way than that to just append after the last element?
There is a function called append:
ans <- list()
for (i in 1992:1994){
  n <- 1 #whatever the function is
  ans <- append(ans, n)
}
ans
## [[1]]
## [1] 1
##
## [[2]]
## [1] 1
##
## [[3]]
## [1] 1
##
Note: Using apply functions instead of a for loop is better (not necessarily faster) but it depends on the actual purpose of your loop.
Answering OP's comment: About using ggplot2 and saving plots to a list, something like this would be more efficient:
plotlist <- lapply(seq(2,4), function(i) {
  require(ggplot2)
  dat <- mtcars[mtcars$cyl == 2 * i,]
  ggplot() + geom_point(data = dat ,aes(x=cyl,y=mpg))
})
Thanks to @Wen for sharing Comparison of c() and append() functions:
Concatenation (c) is pretty fast, but append is even faster and therefore preferable when concatenating just two vectors.
There is: mylist <- c(mylist, df) but that's usually not the recommended way in R. Depending on what you're trying to achieve, lapply() is often a better option.
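To illustrate the lapply() suggestion, here is a small sketch with a placeholder computation (squaring i, which is not from the question). It also shows a pre-allocated loop for comparison, since that is the usual alternative when the number of iterations is known in advance.

```r
# lapply builds the list directly; no index bookkeeping and no growing object
squares <- lapply(1:5, function(i) i^2)

# pre-allocated loop: the list is created at full length once, then filled in
out <- vector("list", 5)
for (i in seq_along(out)) {
  out[[i]] <- i^2
}
```

Both produce the same five-element list; the lapply form simply leaves the indexing to R.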
mylist <- list()
for (i in 1:100){
  n <- 1
  mylist[[(length(mylist) +1)]] <- n
}
This seems to me the faster solution.
library(microbenchmark)
x <- 1:1000
aa <- microbenchmark({xx <- list(); for(i in x) {xx <- append(xx, values = i)} })
bb <- microbenchmark({xx <- list(); for(i in x) {xx <- c(xx, i)} } )
cc <- microbenchmark({xx <- list(); for(i in x) {xx[(length(xx) + 1)] <- i} } )
# median run time in milliseconds (microbenchmark records nanoseconds)
sapply(list(aa, bb, cc), (function(i){ median(i[["time"]]) / 10e5 }))
# append = 4.466634, c = 3.185096, this one = 2.925718
mylist <- list()
for (i in 1:100) {
  df <- 1
  mylist <- c(mylist, df)
}
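One caveat worth sketching: when the object being appended is itself a list (a data frame, for instance), c() splices its elements into the result instead of adding it as one item. The data frame below is a hypothetical stand-in; wrapping it in list() keeps it intact.

```r
df <- data.frame(a = 1:2, b = 3:4)   # hypothetical object created inside a loop

# c() treats the data frame as a list of columns and splices them in
spliced <- c(list(), df)
length(spliced)   # 2: one element per column

# wrapping in list() appends the data frame as a single element
wrapped <- c(list(), list(df))
length(wrapped)   # 1
```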
Use
first_list = list(a=0,b=1)
newlist = c(first_list,list(c=2,d=3))
print(newlist)
$a
[1] 0
$b
[1] 1
$c
[1] 2
$d
[1] 3
Here's an example:
glmnet_params = list(family="binomial", alpha = 1,
type.measure = "auc",nfolds = 3, thresh = 1e-4, maxit = 1e3)
Now:
glmnet_classifier = do.call("cv.glmnet",
c(list(x = dtm_train, y = train$target), glmnet_params))
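The same pattern can be tried without glmnet installed. This is a minimal sketch using base R's seq() as a stand-in for cv.glmnet; base_args and extra_args are made-up names.

```r
base_args  <- list(from = 0, to = 1)   # arguments that are always supplied
extra_args <- list(length.out = 5)     # reusable block of optional arguments

# c() merges the two named lists; do.call passes them as named arguments
res <- do.call("seq", c(base_args, extra_args))
```

This is equivalent to calling seq(from = 0, to = 1, length.out = 5) directly.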

Fisher test using apply function in R

The following is the code: the problem is that the calculation is very slow.
The matrices, gene1, gene2 and neither are of same length (8000)
pos <- c()
neg <- c()
either <- c()
for(i in 1:ncol(both)){
  x <- cbind(both[,i], gene1[,i], gene2[,i], neither[,i])
  test <- apply(x, 1, function(s){fisher.test(matrix(s, nrow = 2),
                                  alternative = "greater")$p.value})
  pos <- c(test,pos)
  test1 <- apply(x, 1, function(s){fisher.test(matrix(s, nrow = 2),
                                   alternative = "less")$p.value})
  neg <- c(test1, neg)
  test2 <- apply(x, 1, function(s){fisher.test(matrix(s, nrow = 2))$p.value})
  either <- c(test2, either)
}
You can try using lapply to loop over the different alternatives (less, greater, two.sided) and wrap the fisher.test call in your own function. Perhaps something like this:
myTest <- function(altn,x){
  apply(x,1,FUN=function(s,alt) {
    fisher.test(matrix(s,nrow=2),alternative=alt)$p.value},
    alt=altn)
}
pos <- c()
neg <- c()
either <- c()
for(i in 1:ncol(both)){
  x <- cbind(both[,i], gene1[,i], gene2[,i], neither[,i])
  rs <- lapply(c('two.sided','greater','less'),myTest,x=x)
  pos <- c(rs[[2]],pos)
  neg <- c(rs[[3]],neg)
  either <- c(rs[[1]],either)
}
Without some test data to check on, I can't assure you there won't be any gotchas in this, but this basic strategy should do what you want.
Note that this still calls fisher.test three times, just in a somewhat more compact form. I don't know of a function that calculates a fisher test with all three alternatives in the same call, but perhaps someone else will weigh in with one.
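For a quick check on small inputs, the three alternatives can also be collected per row in one pass. This is only a sketch on made-up counts (the matrix x below is not the question's data), with a hypothetical helper fisher_p; it still runs fisher.test three times per row, so it is not expected to be faster.

```r
# hypothetical helper: all three p-values for one row of four counts
fisher_p <- function(s) {
  m <- matrix(s, nrow = 2)   # fill the 2x2 table column by column
  c(greater   = fisher.test(m, alternative = "greater")$p.value,
    less      = fisher.test(m, alternative = "less")$p.value,
    two.sided = fisher.test(m)$p.value)
}

# made-up counts: each row is one 2x2 table flattened to four numbers
x <- rbind(c(3, 1, 1, 3),
           c(10, 2, 3, 15),
           c(5, 5, 5, 5))

# apply returns one column per row of x, so transpose to one row per table
pvals <- t(apply(x, 1, fisher_p))
```

Keeping the results in one matrix with named columns avoids having to remember which list position holds which alternative.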

Resources