Changing code based on the apply function to the lapply function - r

I'm trying to formulate the code below using the lapply function (or actually the mclapply function) instead of the apply function. I want it to return a matrix or similar and not a list. The hi function is very complicated in my actual code, so I just presented a very basic example.
hi <- function(a, matrix) {
hi <- a[1] / a[2] * t(matrix) %*% matrix
return(hi)
}
a_1 <- t(matrix(1:4))
a_2 <- t(matrix(5:8))
choleski <- matrix(1:4)
result <- apply(rbind(a_1, a_2), 2, hi, matrix=choleski)
So my question is basically, how do I reformulate the code above using lapply instead of apply, i.e. apply the lapply function to the hi function instead of using the apply procedure. An efficient solution would be awesome.
Thanks.

If you aim to apply hi to each column of rbind(a_1, a_2), then you could do something like this:
A <- rbind(a_1, a_2)
sapply(seq_len(ncol(A)), function(i) hi(A[, i], choleski))
or, more simply (if it's ok to make rbind(a_1, a_2) a data.frame):
A <- as.data.frame(rbind(a_1, a_2))
sapply(A, function(x) hi(x, choleski))
The latter works because a data.frame is a type of list (its columns are the list elements), and sapply applies the function to each list element. By default, sapply simplifies its output if possible.
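Since the question asks specifically for lapply or mclapply, the same column-wise call can be sketched with those functions and the resulting list collapsed afterwards. This is a minimal sketch, assuming each call to hi returns a single value as in the example; mc.cores = 1 is an illustrative choice (forking with more cores is not available on Windows).

```r
library(parallel)

hi <- function(a, matrix) {
  a[1] / a[2] * t(matrix) %*% matrix
}
a_1 <- t(matrix(1:4))
a_2 <- t(matrix(5:8))
choleski <- matrix(1:4)
A <- rbind(a_1, a_2)

# lapply returns a list with one 1x1 matrix per column of A
res_list <- lapply(seq_len(ncol(A)), function(i) hi(A[, i], choleski))

# mclapply has the same interface; mc.cores = 1 keeps the sketch portable
res_par <- mclapply(seq_len(ncol(A)), function(i) hi(A[, i], choleski),
                    mc.cores = 1)

# Collapse the lists into plain numeric vectors
res_vec <- unlist(res_list)
res_par_vec <- unlist(res_par)
```

If hi returned a longer vector per column, do.call(cbind, res_list) would assemble one column per call instead of unlist.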

Related

How to use lapply with a condition in R to fit only one element each time

Suppose I have two vectors, and I would like my function to take only one value from each vector at a time and return the output. Then I would like another function to check the value from each run: if the output of the previous run is smaller than the new one, the function should stop and return all the previous values. My original function is very complicated (estimation models), so I provide a simplified example to explain my idea.
Suppose that I have these two vectors:
set.seed(123)
x <- rnorm(1:20)
y <- rnorm(1:20)
Then, I would like to write a function that takes only one value from each vector, multiplies them, and returns the output. The function should then check whether the previous product is smaller than the new one. If it is, it should stop and return all the previous products.
I tried the function below. However, it processes all the values at once and returns a list of the products. I was thinking about using lapply to process one element at a time, but I do not know how to handle the condition.
myfun <- function(x, y, n){
multi <- list()
for ( i in 1:n){
multi[[i]] <- x[[i]]*y[[i]]
}
return(multi)
}
myfun(x,y,10)
Here is another try
x <- rnorm(1:20)
y <- rnorm(1:20)
myfun <- function(x, y){
multi <- x*y
return(multi)
}
This is the first function. I would like to run it element by element, so that each call returns only one product. Then another function (a wrapper function) checks the result: if the second output of the multiplication function is larger than the first one, stop; otherwise keep going.
In other words, the function should take only one value from each vector, multiply them, and return the output, and then check whether the previous product is smaller than the new one.
I would like the multiplication to live in a separate function whose output I then check, so I need a wrapper function.
You can apply a for loop with a stopping condition, similar to what you have already:
# example input
set.seed(123)
x <- rnorm(1:20)
y <- rnorm(1:20)
# example function
f = function(xi, yi) xi*yi
# wrapper
stopifnot(length(x) == length(y))
res = vector(length(x), mode="list")
for (i in seq_along(x)){
res[[i]] = f(x[[i]], y[[i]])
if (i > 1L && res[[i]] > res[[i-1L]]) break
}
res[seq_len(i)]
Comments:
It is better to predefine the max length res might need (here, length(x)), rather than expanding it in the loop.
For this function (multiplication), there is no good reason to proceed elementwise. R's multiplication function is vectorized and fast.
You don't need to use a list-class output for this function, since it is returning doubles; res = double(length(x)) should also work.
You don't need to use list-style accessors for x, y and res unless lists are involved; res[i] = f(x[i], y[i]) should work, etc.
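The comments above can be sketched as follows: the same loop with a preallocated double vector and plain `[` indexing, next to a vectorized variant that computes all products at once and then finds the first increase. This is a minimal sketch; the names loop_result, vec_result and stop_at are illustrative, and the stopping rule mirrors the loop (the element that triggers the stop is included).

```r
set.seed(123)
x <- rnorm(20)
y <- rnorm(20)

f <- function(xi, yi) xi * yi

# Loop version: preallocated double vector, plain `[` indexing
res <- double(length(x))
for (i in seq_along(x)) {
  res[i] <- f(x[i], y[i])
  if (i > 1L && res[i] > res[i - 1L]) break
}
loop_result <- res[seq_len(i)]

# Vectorized version: compute every product, then locate the first increase
p <- x * y
up <- which(diff(p) > 0)  # positions k where p[k + 1] > p[k]
stop_at <- if (length(up) > 0L) up[1L] + 1L else length(p)
vec_result <- p[seq_len(stop_at)]
```

The vectorized variant only makes sense when computing all products up front is cheap; for an expensive estimation step, the loop avoids the work after the stopping point.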

call function per vector element

I have written a simple function to calculate the average power of measurements in dBm. First I had to write the function to convert dBm to watts, find the average and then convert the value back to dBm.
This works great for a single vector
For example:
MeanDB <- function(dBVector) {
# Returns the mean of the input data. First converts the data to linear scale, then applies mean().
# Then the result is converted back to log scale
return(10*log10(mean(10^(dBVector/10))))
}
Now I would like to apply the same function to the elements of two vectors, for example vector1 and vector2.
I would like to call my function for each pair of elements from vector1 and vector2 (they are the same size).
The easiest way would of course be a for loop:
# preallocate the result, then average each pair of elements
keepResults <- numeric(length(vector1))
for (i in seq_along(vector1)) {
keepResults[i] <- MeanDB(c(vector1[i], vector2[i]))
}
but I am quite sure R offers more efficient alternatives. Can you suggest R commands that do this more concisely?
Regards and thanks
Alex
There is no need for *apply loops. Simply write a vectorized function:
MeanDB <- function(...) {
stopifnot(length(unique(lengths(list(...)))) == 1L)
M <- cbind(...)
return(10 * log10(rowMeans(10 ^ (M / 10))))
}
a = c(1, 2, 3)
b = c(2, 3, 4)
MeanDB(a, b)
This generalizes to an arbitrary number of vectors.
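As a quick illustration of that generalization, here is a sketch with three vectors, checked against the scalar formula for the first position. The vector d and the names three and first are made up for the example.

```r
MeanDB <- function(...) {
  # all inputs must have the same length
  stopifnot(length(unique(lengths(list(...)))) == 1L)
  M <- cbind(...)  # one column per input vector
  10 * log10(rowMeans(10 ^ (M / 10)))
}

a <- c(1, 2, 3)
b <- c(2, 3, 4)
d <- c(0, 10, 20)  # hypothetical third vector

three <- MeanDB(a, b, d)

# Same computation for the first position only, written out by hand
first <- 10 * log10(mean(10 ^ (c(a[1], b[1], d[1]) / 10)))
```

Each row of M holds one element from every input, so rowMeans averages across the vectors position by position.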
For just two lists, you could rewrite your function as:
MeanDB <- function(number1,number2) {
return(10*log10(mean(10^(c(number1,number2)/10))))
}
# Example of implementation:
a = list(1,2,3)
b = list(2,3,4)
mapply(MeanDB,a,b)
Hope this helps!
You can also use pmap (general case) or map2 (in this case) from the purrr package. Using the function defined by @F.Maas above:
library(purrr)
MeanDB <- function(number1,number2) {
return(10*log10(mean(10^(c(number1,number2)/10))))
}
# pmap_dbl passes the i-th element of each list to MeanDB as positional arguments
pmap_dbl(list(list(1,2,3),list(4,5,6)),MeanDB)

R: How to create a loop for, for a range of data in a function?

I have this parameter:
L_inf <- seq(17,20,by=0.1)
and this function:
fun <- function(x){
L_inf*(1-exp(-B*(x-0)))}
I would like to apply this function for a range of values of L_inf.
I tried with a for loop, like this:
A <- matrix() # maybe 10 col and 31 row or vice versa
for (i in L_inf){
A[i] <- fun(1:10)
}
But R responds: longer object length is not a multiple of shorter object length.
My expected output is a matrix (or data frame, or maybe a list) with 10 results (fun(1:10)) for each value of the vector L_inf (length = 31).
How can I do it?
You are trying to put a vector of 10 elements into a single matrix cell. You want to assign it to a matrix row instead (you can access the ith row with A[i,]).
A for loop is not needed in this case, though, and it is quite straightforward to use one of the "apply" functions instead. lapply returns a list (which is the most versatile container since there is basically no constraint on its elements).
Here sapply is an apply function which tries to Simplify its result to a convenient data structure. In this case, since all results have the same length (10), sapply will simplify the result to a matrix.
Note that I modified your function to make it explicitly depend on L_inf. Otherwise it will not do what you think it should do (see keyword "closures" if you want more info).
L_inf_range <- seq(17,20,by=0.1)
B <- 1
fun <- function(x, L_inf) {
L_inf*(1-exp(-B*(x-0)))
}
sapply(L_inf_range, function(L) fun(1:10, L_inf=L))
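For comparison, the row-assignment fix mentioned above can be sketched as a loop over indices with a preallocated matrix. This is a minimal sketch assuming B = 1 as in the answer; the names A and S are illustrative.

```r
L_inf_range <- seq(17, 20, by = 0.1)
B <- 1
fun <- function(x, L_inf) {
  L_inf * (1 - exp(-B * (x - 0)))
}

# Preallocate one row per L_inf value and one column per x value
A <- matrix(NA_real_, nrow = length(L_inf_range), ncol = 10)
for (i in seq_along(L_inf_range)) {
  A[i, ] <- fun(1:10, L_inf = L_inf_range[i])  # fill the i-th row
}

# sapply puts each result in a column, so it is the transpose of A
S <- sapply(L_inf_range, function(L) fun(1:10, L_inf = L))
```

Note that the loop iterates over seq_along(L_inf_range) rather than over the values themselves, because values such as 17.1 are not valid row indices.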

mapply - passing row and column of element as argument

I'm new to R programming and I know I could write a loop to do this, but everything I read says that for simplicity it's best to avoid loops and use apply instead.
I have a matrix and I would like to run this function on each element in the matrix.
cellresidue <- function(i,j){
result <- (cluster[i,j] - cluster.I[i,] - cluster.J[j,] - cluster.IJ)/(cluster.N*cluster.M)
return (result)
}
i= element row
j= element column
cluster.J is a matrix of column means
cluster.I is a matrix of row means
cluster.IJ is the mean of the entire matrix named cluster
What I can't figure out is how to get the row and column of the element that mapply is working on (I think I should use the row() and col() functions), and how to pass those as arguments to mapply or apply.
There is no need for loops or *apply functions. You can just use plain matrix operations:
nI <- nrow(cluster)
nJ <- ncol(cluster)
# row means, repeated across the columns
cluster.I <- matrix(rowMeans(cluster), nI, nJ, byrow = FALSE)
# column means, repeated across the rows
cluster.J <- matrix(colMeans(cluster), nI, nJ, byrow = TRUE)
cluster.IJ <- matrix( mean(cluster), nI, nJ)
residue.mat <- (cluster - cluster.I - cluster.J - cluster.IJ) /
(cluster.N * cluster.M)
(You did not explain what cluster.N and cluster.M are but I assume they are scalars)
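To connect this back to the original question about row() and col(), here is a small sketch on mock data that computes the residue both ways and compares them. The data, the choice of cluster.N and cluster.M as the matrix dimensions, and the names row_means, col_means, grand_mean, vectorized and elementwise are all assumptions made for illustration.

```r
set.seed(1)
cluster <- matrix(rnorm(12), nrow = 3, ncol = 4)  # mock data
cluster.N <- nrow(cluster)  # assumption: a scalar
cluster.M <- ncol(cluster)  # assumption: a scalar

row_means <- rowMeans(cluster)
col_means <- colMeans(cluster)
grand_mean <- mean(cluster)

# Vectorized: row(cluster) and col(cluster) give each cell's row/column index
vectorized <- (cluster - row_means[row(cluster)] - col_means[col(cluster)] -
                 grand_mean) / (cluster.N * cluster.M)

# Element-wise: mapply walks over the (i, j) index pairs in column-major order
cellresidue <- function(i, j) {
  (cluster[i, j] - row_means[i] - col_means[j] - grand_mean) /
    (cluster.N * cluster.M)
}
elementwise <- matrix(mapply(cellresidue, row(cluster), col(cluster)),
                      nrow = nrow(cluster))
```

Both give the same matrix; the vectorized form avoids one function call per cell.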
It is not clear from your question what you are trying to do. It is best on this site to provide some mock data (preferably generated by the code, not pasted), and then show what form the end result should look like. It seems that the apply family is not what you seek.
Quick disambiguation between apply, sapply and mapply:
#providing data for examples
X=matrix(rnorm(9),3,3)
apply: apply a function to either columns (2) or rows (1) of a matrix or array
#here, sum by columns, same as colSums(X)
apply(X, 2, sum)
sapply: apply a function against (usually) a list of objects
#create a list with three vectors
mylist=list(1:4, 5:10, c(1,1,1))
#get the mean of each vector
sapply(mylist, mean)
#subtract 2 from each element of X, same as c(X-2)
sapply(X, FUN=function(x) x-2)
mapply: a multivariate version of sapply, taking an arbitrary number of arguments. Never had much use of it… Some rock-bottom examples:
#same as c(1,2,3,4) + c(15,16,17,18)
mapply(sum, 1:4, 15:18)
#same as c(X+X), the vectorized matrix sum
mapply(sum, X, X)
Side note: It's perfectly OK to use loops in R; use whichever best suits your way of thinking. The issue is that with a "really big" number of iterations you could hit bottlenecks, depending on your patience. There are two solutions to this: rewrite your function in C/FORTRAN (and boost speed), or use built-in functions if applicable (which are, by the way, often written in C or FORTRAN).

Quantiles of a data.frame

I have a data.frame() whose columns I'd like to calculate quantiles for:
tert <- c(0:3)/3
data <- dbGetQuery(dbCon, "SELECT * FROM tablename")
quans <- mapply(quantile, data, probs=tert, name=FALSE)
But the result only contains the last element of what quantile returns and not the whole result. I also get the warning "longer argument not a multiple of length of shorter". How can I modify my code to make it work?
PS: The function alone works like a charm, so I could use a for loop:
quans <- quantile(a$fileName, probs=tert, name=FALSE)
PPS: What also works is not specifying probs
quans <- mapply(quantile, data, name=FALSE)
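The problem is that mapply iterates over every argument passed through ... in parallel, so it pairs each column of data with a single element of probs rather than with the whole vector (recycling the shorter argument, which is where the warning comes from). Since you only want to loop over one argument, you should use lapply, not mapply: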
lapply(data, quantile, probs=tert, name=FALSE)
Alternatively, you can still use mapply but specify the arguments that are not to be looped over in the MoreArgs argument.
mapply(quantile, data, MoreArgs=list(probs=tert, name=FALSE))
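A small self-contained sketch of both options, using a made-up data frame in place of the database query result (the columns a and b and the names q_list, q_mat and q_sapply are hypothetical):

```r
tert <- (0:3) / 3

# Hypothetical stand-in for the result of dbGetQuery()
data <- data.frame(a = c(1, 2, 3, 4), b = c(10, 20, 30, 40))

# lapply loops over the columns only; probs and names are passed on unchanged
q_list <- lapply(data, quantile, probs = tert, names = FALSE)

# mapply with MoreArgs: the arguments in MoreArgs are not iterated over
q_mat <- mapply(quantile, data, MoreArgs = list(probs = tert, names = FALSE))

# sapply is lapply plus simplification, giving the same matrix as mapply
q_sapply <- sapply(data, quantile, probs = tert, names = FALSE)
```

Here q_list is a list with one vector of four quantiles per column, while q_mat and q_sapply are matrices with one column per data column.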
I finally found a workaround which I don't like but which kind of works. Perhaps someone can tell me the right way to do it:
q <- function(x) { quantile(x, probs=c(0:3)/3, names=FALSE) }
mapply(q, data)
This works, but I have no idea what the difference is.
