I have a function that requires 4 parameters:
myFun <- function(a,b,c,d){}
I have a matrix where each row contains the parameters:
myMatrix = matrix(c(a1,a2,b1,b2,c1,c2,d1,d2), nrow=2, ncol=4)
Currently I have a loop which feeds the parameters to myFun:
m <- myMatrix
i <- 1
someVector <- c()
while (i<(length(m[,1])+1)){
someVector[i] <-
myFun(m[i,1],m[i,2],m[i,3],m[i,4])
i = i+1
}
print(someVector)
Is there a better way to get the same result using sapply instead of a loop?
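You can use mapply() here, which accepts vectors as arguments and calls the function on their elements in parallel. Turn your matrix into a data frame first; since the matrix has no column names, the columns are named V1 to V4 by default.
df <- as.data.frame(myMatrix)
results <- mapply(myFun, df$V1, df$V2, df$V3, df$V4)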
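As a minimal sketch of the above, assuming a made-up myFun and a small numeric matrix (both are placeholders, not taken from the question): mapply() calls the function once per row of the data frame, and an apply() over the rows gives the same result without the conversion.

```r
# Hypothetical stand-in for the real myFun
myFun <- function(a, b, c, d) a + b * c - d

# Each row holds one set of parameters: (1, 3, 5, 7) and (2, 4, 6, 8)
myMatrix <- matrix(1:8, nrow = 2, ncol = 4)

# Option 1: convert to a data frame; unnamed matrix columns become V1..V4
df <- as.data.frame(myMatrix)
res_mapply <- mapply(myFun, df$V1, df$V2, df$V3, df$V4)

# Option 2: stay with the matrix and apply over rows (MARGIN = 1)
res_apply <- apply(myMatrix, 1, function(r) myFun(r[1], r[2], r[3], r[4]))

print(res_mapply)  # 9 18
print(res_apply)   # 9 18
```

Both versions return one value per row, so either can replace the while loop.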
If I want variables with numbered names, for example created in a for loop, I can use get and assign:
for(i in 1:2){
assign(paste0('a',toString(i)),i*pi)
}
get('a2')
output
[1] 6.283185
But what if I want to do something similar for a dataframe?
I would like to do something like
df<-data.frame(matrix(ncol = 2,nrow = 3))
varnames <- c()
for(i in 1:2){
varnames <- c(varnames, paste0('a', toString(i)))
}
colnames(df) <- varnames
for(i in 1:2){
assign(paste0('df$a',toString(i)), rep(i*pi,3))
}
get(paste0('df$a',toString(2)))
But this just creates variables called df$a1 and df$a2 instead of assigning c(i*pi,i*pi,i*pi) to the columns of the dataframe df.
What I really want is to be able to manipulate individual entries of the columns like this:
for(i in 1:2){
for(j in 1:3)
assign(paste0('df$a',toString(i),'[',toString(j),']'), i*pi)
}
get(paste0('df$a',toString(2),'[2]'))
where I would be able to get df$a2[2].
I think something like a python dictionary would work too.
Instead of assign, just assign to the column directly with [[:
for(i in 1:2) df[[paste0('a', i)]] <- rep(i * pi, 3)
and then can get the value with
df[[paste0('a', 2)]][2]
[1] 6.283185
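For the per-element case in the question (df$a2[2]), the same idea extends to single cells. A small sketch, where the fill value i * pi * j is made up purely so that each cell is distinguishable:

```r
# Start from an empty 3 x 2 data frame with columns a1 and a2
df <- data.frame(a1 = rep(NA_real_, 3), a2 = rep(NA_real_, 3))

for (i in 1:2) {
  for (j in 1:3) {
    # Build the column name as a string, then index the element
    df[[paste0("a", i)]][j] <- i * pi * j
  }
}

# Read a single cell back; both lines address the same cell as df$a2[2]
df[[paste0("a", 2)]][2]   # 2 * pi * 2
df[2, "a2"]
```

No assign or get is needed because the column name is just a character string passed to [[ or [.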
assign can be used, but it is not recommended when we can do this more directly:
for(i in 1:2) assign("df",`[[<-`(df, paste0('a', i), value = i * pi))
df[[paste0('a', 2)]][1]
[1] 6.283185
The get should be applied to the object, i.e. 'df', rather than to its columns:
get('df')[[paste0('a', 2)]][1]
First of all, it is not generally a great idea to use assign to create objects in the global environment. In preference, you should be creating a named list instead for all sorts of good reasons, not least of which is the ability to iterate over the objects you create.
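A minimal sketch of the named-list approach, assuming the same a1/a2 naming as in the question (the variable names vals and doubled are made up for illustration). It plays the role of a Python dictionary:

```r
# Build a named list instead of separate variables a1, a2
vals <- setNames(lapply(1:2, function(i) i * pi), paste0("a", 1:2))

# Look an element up by a name constructed at run time
vals[[paste0("a", 2)]]

# Iterating over all elements is straightforward
doubled <- lapply(vals, function(v) v * 2)
names(doubled)
```

Everything stays in one object, so nothing has to be fetched from the global environment by name.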
Secondly, note that the block of code:
varnames <- c()
for(i in 1:2){
varnames <- c(varnames, paste0('a', toString(i)))
}
colnames(df) <- varnames
can be replaced with the one-liner:
colnames(df) <- paste0("a", 1:2)
Finally, you should take advantage of R's vectorization and the ability to subset with ["colname"] notation. This removes the need for an explicit loop altogether here:
df[paste0("a", 1:2)] <- sapply(1:2, \(i) rep(i * pi, 3))
df
#> a1 a2
#> 1 3.141593 6.283185
#> 2 3.141593 6.283185
#> 3 3.141593 6.283185
I have two matrices:
m1 <- matrix(runif(750), nrow = 50, byrow=T)
m2 <- matrix(rep(TRUE,750), nrow = 50, byrow=T)
For each m1 row, I need to find the indices of the two lowest values. Then, I need to use the remaining indices (i.e. not the two lowest values) to assign FALSE in m2.
It is fairly easy to do for one row:
ind <- order(m1[1,], decreasing=FALSE)[1:2]
m2[1,][-ind] <- FALSE
Therefore, I can use a loop to do the same for all rows:
for (i in 1:dim(m1)[1]){
ind <- order(m1[i,], decreasing=FALSE)[1:2]
m2[i,][-ind] <- FALSE
}
However, in my data set this loop runs slower than I would like (since my matrices are quite large - 500000x150000).
Is there any faster, R way to achieve the same result without the use of loops?
You can try the code below
m2 <- t(apply(m1,1,function(x) x %in% head(sort(x),2)))
You can try apply, since you have a matrix:
val <- rep(TRUE, ncol(m1))
m3 <- t(apply(m1, 1, function(x) {val[-order(x)[1:2]] <- FALSE;val}))
You can do:
m2 <- t(apply(m1, 1, function(x) rank(x)<3))
Using pmap (min_rank comes from dplyr, so load it as well):
library(purrr)
library(dplyr)
pmap_dfr(as.data.frame(m1), ~ min_rank(c(...)) < 3)
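Before running any of these on the full data, it may be worth checking on a small matrix that they reproduce the loop. A sketch with a seeded 4 x 5 example (the size and seed are arbitrary, and the names m2_loop, m2_rank and m2_in are made up):

```r
set.seed(1)
m1 <- matrix(runif(20), nrow = 4, byrow = TRUE)

# Reference result: the original loop
m2_loop <- matrix(TRUE, nrow = nrow(m1), ncol = ncol(m1))
for (i in seq_len(nrow(m1))) {
  ind <- order(m1[i, ])[1:2]
  m2_loop[i, -ind] <- FALSE
}

# Two of the apply-based answers; apply() returns the rows as columns,
# hence the t()
m2_rank <- t(apply(m1, 1, function(x) rank(x) < 3))
m2_in   <- t(apply(m1, 1, function(x) x %in% head(sort(x), 2)))

# All versions should agree and keep exactly two TRUEs per row
stopifnot(all(m2_loop == m2_rank), all(m2_loop == m2_in))
stopifnot(all(rowSums(m2_loop) == 2))
```

Note that the rank() and %in% versions can mark more than two entries per row if a row contains ties among its smallest values, whereas order()[1:2] always picks exactly two; with runif() data ties are not a practical concern.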
I have a list of dataframes, and I would like to multiply each one by the corresponding element of a vector.
The first dataframe in the list would be multiplied by the first observation of the vector, and so on, producing another list of dataframes already multiplied.
I tried to do this with a loop, but was unsuccessful. I also tried to imagine something using map or lapply, but I couldn't.
for(i in vec){
for(j in listdf){
listdf2 <- i*listdf[[j]]
}
}
Error in listdf[[j]] : invalid subscript type 'list'
Any idea how to solve this?
*Vector and the List of Dataframes have the same length.
Use Map:
listdf2 <- Map(`*`, listdf, vec)
In purrr, this can be done using map2:
listdf2 <- purrr::map2(listdf, vec, `*`)
If you are interested in a for loop solution, you just need one loop:
listdf2 <- vector('list', length(listdf))
for (i in seq_along(vec)) {
listdf2[[i]] <- listdf[[i]] * vec[i]
}
data
vec <- c(4, 3, 5)
df <- data.frame(a = 1:5, b = 3:7)
listdf <- list(df, df, df)
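Putting the data and the Map call together as a runnable sketch. It also hints at why the original nested loop failed: for (j in listdf) makes j a data frame rather than an index, so listdf[[j]] is an invalid subscript.

```r
vec <- c(4, 3, 5)
df <- data.frame(a = 1:5, b = 3:7)
listdf <- list(df, df, df)

# Pair the k-th data frame with the k-th element of vec
listdf2 <- Map(`*`, listdf, vec)

length(listdf2)   # 3
listdf2[[2]]$a    # 3 6 9 12 15
listdf2[[3]]$b    # 15 20 25 30 35
```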
There are two lists including many matrices:
df <- data.frame(replicate(100,sample(0:100,100,rep=TRUE)))
l.i <- vector("list")
l.j <- vector("list")
for (var in names(df[1:50])) {
l.i[[var]] <- as.matrix(dist(df[var], "euclidean"))
}
for (var in names(df[51:100])) {
l.j[[var]] <- as.matrix(dist(df[var], "euclidean"))
}
I want to compute Mantel tests between all pairwise elements in l.i and l.j (but not within them). I can do e.g.:
library(vegan)
all.i.vs.j1 <- lapply(l.i, function(x) mantel(x, l.j$X51))
all.i.vs.j2 <- lapply(l.i, function(x) mantel(x, l.j$X52))
This is indeed the output I want, but I would like to wrap it in a for loop or lapply.
Thank you!
We can use Map to apply the function mantel on corresponding elements of 'l.i' and 'l.j'
library(vegan)
out <- Map(mantel, l.i, l.j)
length(out)
#[1] 50
If we need all pairwise combinations, then use outer:
f1 <- function(x, y) list(mantel(x, y))
out1 <- outer(l.i, l.j, FUN = Vectorize(f1))
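The all-pairs pattern can also be written as a nested sapply. The sketch below avoids the vegan dependency by using a hypothetical stand-in, mantel_r, that only computes the Pearson correlation between the lower triangles of the two distance matrices; it does not run the permutation test that mantel performs. The data are scaled down to 6 columns and 10 rows to keep it quick.

```r
set.seed(42)
df <- data.frame(replicate(6, sample(0:100, 10, replace = TRUE)))

# Distance matrices for the two groups of columns (X1..X3 and X4..X6)
l.i <- lapply(df[1:3], function(v) as.matrix(dist(v)))
l.j <- lapply(df[4:6], function(v) as.matrix(dist(v)))

# Hypothetical stand-in for vegan::mantel: correlation statistic only
mantel_r <- function(x, y) cor(x[lower.tri(x)], y[lower.tri(y)])

# Rows index l.i, columns index l.j; no within-list comparisons are made
res <- sapply(l.j, function(y) sapply(l.i, function(x) mantel_r(x, y)))
dim(res)
res["X2", "X5"]
```

To keep the full mantel objects instead of a single number, replacing both sapply calls with lapply and mantel_r with mantel should give a nested list indexed first by the l.j name and then by the l.i name.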
I want to use apply instead of a for-loop. The problem is, my for-loop uses two data.frames as an input. For example:
x <- data.frame(col1=c(1,NA,3,NA), col2=c(9,NA,11,12))
y <- data.frame(col1=c(1,2,3,4), col2=c(5,6,7,8))
output <- rep(NA,2)
for(i in 1:2)
{
output[i] <- sum(is.na(x[,i]))+sum(y[,i])
}
The result here is, correctly, c(12,27).
But if I try function and apply:
test <- function(vector1,vector2) sum(is.na(vector1))+sum(vector2)
apply(x,y,MARGIN=2,FUN=test)
With apply the result is c(38,37).
How can I fix this?
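apply only iterates over x; y is passed unchanged as an extra argument to every call, so sum(vector2) is always sum(y) = 36, which gives 36 + 2 = 38 and 36 + 1 = 37. Use mapply instead, which walks over the columns of both data frames in parallel: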
x <- data.frame(col1=c(1,NA,3,NA), col2=c(9,NA,11,12))
y <- data.frame(col1=c(1,2,3,4), col2=c(5,6,7,8))
test <- function(vector1,vector2) sum(is.na(vector1))+sum(vector2)
mapply(test, x, y)
# col1 col2
# 12 27
?mapply
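For this particular function, a fully vectorized alternative is also possible, since both parts are column sums. A small sketch using the same x and y:

```r
x <- data.frame(col1 = c(1, NA, 3, NA), col2 = c(9, NA, 11, 12))
y <- data.frame(col1 = c(1, 2, 3, 4), col2 = c(5, 6, 7, 8))

# Count the NAs per column of x and add the per-column sums of y
output <- colSums(is.na(x)) + colSums(y)
output
# col1 col2
#   12   27
```

mapply remains the more general tool when the per-column function cannot be expressed with colSums.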