Mapply with margin in R - r

I need to apply a function of two arguments element-wise to two matrices.
mapply(function(x, y) x+y, rbind(1:3, 1:3), rbind(2:4, 2:4))
The output is
[1] 3 3 5 5 7 7
which is not the format I want: I need the result to keep its matrix form.
On the other hand, apply has a MARGIN argument that helps retain the matrix format, but it only works with a single argument.
apply(rbind(1:3,1:3), MARGIN = c(1,2), function(x) x+3)
     [,1] [,2] [,3]
[1,]    4    5    6
[2,]    4    5    6
The point is that apply has a MARGIN argument and mapply has nothing like it. Or does it?
PLEASE: I am not asking how to rearrange the result afterwards; I can do that myself. I am using this piece of code to write a function that takes a three-dimensional meshgrid, which would be a hassle to rearrange.
EDITED LATER:
I am really sorry, I didn't elaborate on this.
Of course, I am not stuck because I want to do
rbind(1:3, 1:3) + rbind(2:4, 2:4)
These rbind calls are just examples of the matrices I am using. And function(x, y) x+y is only a stand-in for a very long set of nested functions that would be confusing and inefficient to copy here. What matters for now is that it is a function of two variables.
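As far as I know, mapply has no MARGIN-like argument; its signature only offers MoreArgs, SIMPLIFY and USE.NAMES. A minimal sketch of one workaround, assuming both inputs have identical dimensions and the function returns a single value per pair: since mapply walks both arrays in the same column-major order, copying the dim attribute back is one assignment regardless of the number of dimensions. The helper name mapply_keep_dim is hypothetical.

```r
# Hypothetical helper: element-wise mapply that restores the input's dimensions.
mapply_keep_dim <- function(FUN, a, b) {
  stopifnot(identical(dim(a), dim(b)))   # assumption: same shape
  res <- mapply(FUN, a, b)               # flat vector, column-major order
  dim(res) <- dim(a)                     # works for matrices and n-d arrays
  res
}

A <- rbind(1:3, 1:3)
B <- rbind(2:4, 2:4)
out <- mapply_keep_dim(function(x, y) x + y, A, B)

# Same helper on a three-dimensional array
arr <- array(1:24, dim = c(2, 3, 4))
out3 <- mapply_keep_dim(function(x, y) x * y, arr, arr)
```

Here out is a 2 x 3 matrix and out3 keeps the 2 x 3 x 4 shape, so no manual rearranging per dimension is needed.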

Related

Calculate levenshteinDist between rownames and colnames using mapply

I want to calculate the levenshteinDist distance between the rownames and colnames of a matrix using the mapply function, because my matrix is very large and a nested "for" loop takes a very long time to produce the result.
Here's the old code with nested loop:
mymatrix <- matrix(NA, nrow=ncol(dataframe),ncol=ncol(dataframe),dimnames=list(colnames(dataframe),colnames(dataframe)))
distfunction = function (text1, text2) {return(1 - (levenshteinDist(text1, text2)/max(nchar(text1), nchar(text2))))}
for(i in 1:ncol(mymatrix))
{
  for(j in 1:nrow(mymatrix))
    mymatrix[i,j]=(distfunction(rownames(mymatrix)[i], colnames(mymatrix)[j]))*100
}
I tried to replace the nested loop with mapply:
mapply(distfunction,mymatrix)
It gave me this error:
Error in typeof(str2) : argument "text2" is missing, with no default
My plan was to apply levenshteinDist to the matrix first and then work out how to apply my own function.
Is it possible?
Thank you.
The function mapply cannot be used in this context. It requires two input vectors, and the function is applied to the first elements, then the second elements, and so on. But you want it applied to all combinations.
You could try a nested sapply:
sapply(colnames(mymatrix), function(col)
  sapply(rownames(mymatrix), function(row)
    distfunction(row, col)))*100
Simple usage example
sapply(1:3, function(x) sapply(1:4, function(y) x*y))
Output:
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    2    4    6
[3,]    3    6    9
[4,]    4    8   12
Update
Even better is to use outer, but I think your distfunction is not vectorized (due to the max). So use the wrapper function Vectorize:
distfunction_vec <- Vectorize(distfunction)
outer(rownames(mymatrix), colnames(mymatrix), distfunction_vec)
But I'm not sure about the performance penalty of Vectorize. It would be better to vectorize the function directly (probably with pmax instead of max).
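A minimal sketch of that direct vectorization, under stated assumptions: base R's utils::adist is swapped in for levenshteinDist (which is not part of base R), and the strings in nms are made-up examples. adist already returns the full pairwise distance matrix, so outer is only needed for the element-wise maximum of the string lengths, which is where pmax replaces max.

```r
# Assumption: adist (generalized Levenshtein distance, default costs) stands in
# for levenshteinDist; nms is hypothetical example data.
nms <- c("kitten", "sitting", "kit")

dist_mat <- adist(nms, nms)                       # pairwise edit distances
len_max  <- outer(nchar(nms), nchar(nms), pmax)   # pairwise max string length
sim      <- (1 - dist_mat / len_max) * 100        # similarity in percent
dimnames(sim) <- list(nms, nms)
```

Every entry is computed without an explicit loop; whether this is faster than the Vectorize wrapper on a given data set would need to be measured.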

Using the Outer Function

I'm having difficulty using the outer function. I've looked at a few threads, but haven't been able to find a solution.
I have a matrix, prices, with the following information:
25 26
I use the outer function as follows to multiply these numbers together:
a = outer(prices[1,1:2],prices[1,1:2],FUN ="*")
This gives me the following error:
Error in as.vector(X) %*% t(as.vector(Y)) :
requires numeric/complex matrix/vector arguments
If, however, I do the exact same thing, but with the numbers directly, it works as I would like it to:
a = outer(c(25,26),c(25,26),FUN ="*")
and returns a 2x2 matrix with the products.
Any help would be greatly appreciated.
Your prices matrix is apparently a data.frame instead of a matrix. You can either change that:
prices <- as.matrix(prices)
a <- outer(prices[1,1:2],prices[1,1:2],FUN ="*")
or you can just convert to numeric when you use it:
a <- outer(as.numeric(prices[1,1:2]),as.numeric(prices[1,1:2]),FUN ="*")
With prices as an actual matrix, the original call works as expected:
prices <- matrix(c(25,26), nrow=1)
a = outer(prices[1,1:2],prices[1,1:2],FUN ="*")
#      [,1] [,2]
# [1,]  625  650
# [2,]  650  676
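To see why the conversion matters, here is a small sketch assuming prices is a one-row data.frame (the column names a and b are hypothetical). Subsetting a row of a data.frame yields another data.frame, which is a list rather than a numeric vector, so it has to be flattened before it is handed to outer.

```r
# Assumption: prices was read in as a data.frame, not a matrix.
prices_df <- data.frame(a = 25, b = 26)

row_df <- prices_df[1, 1:2]
row_is_df <- is.data.frame(row_df)            # still a data.frame (a list)

row_num <- unlist(row_df, use.names = FALSE)  # plain numeric vector
prod_mat <- outer(row_num, row_num, FUN = "*")
```

unlist is used here instead of as.numeric only to show another way of getting a plain vector; either conversion gives outer the numeric input it needs.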

Applying a function to a specific element of a list of a list

I have a function that returns a list of vectors and matrices. I then create a variable that is a list of several of the resulting lists from calls to the function. So I have a list of lists. My question is how do I apply a function over the elements of these lists (note this is not the same as applying a function over the lists themselves). Here is a simple example that retains all the essential features of what I am doing
numtrials = 5
x = rep(list(NULL),numtrials)
testfunction = function(){return( list( c(1,2,3,4,5), matrix(runif(10), 2,5),
matrix(0,2,2) ) )}
for(index in 1:numtrials){
x[[index]] = testfunction()
}
I want to now calculate the mean of say the (2,3) element of x[[index]][[2]] across all "index" lists. Or even better get a matrix of means, xbar, such that xbar[i,j] = mean(x[[]][[2]][i,j]). I tried to play around with (and of course read the help file for) lapply, and apply, but couldn't get it to work. One of the reasons is that x[[]][[2]][i,j] appears to be invalid notation
Error in x[[]] : invalid subscript type 'symbol'
I think R doesn't know what to make of the "[[]]". I know some people are going to suggest vectorizing but note that my function returns matrices and vectors of different, unrelated dimensions (although I am not opposed to vectorizing if you have a clever way of doing this).
Using abind, you can create a list which contains arrays for the relevant components of the inner lists,
e.g.
library(abind)
xl <- do.call(mapply, c('abind', x, rev.along = 0))
# the second element from each inner list is now within a 3-d array
# which is the 2nd element of xl
# you can now construct your matrix of mean values by using `apply`
means <- apply(xl[[2]], 1:2, mean)
means
##           [,1]      [,2]      [,3]      [,4]      [,5]
## [1,] 0.4576039 0.5185270 0.7099742 0.3812656 0.4529965
## [2,] 0.6528345 0.2304651 0.5534443 0.4404609 0.7361132
If you know which elements you want to pull out, then it's pretty straightforward to grab them with sapply/lapply, and get the mean:
# Mean of [[2]][2, 3] elements
values = sapply(x, function(elem) {
  return(elem[[2]][2, 3])
})
mean(values)
sapply applies a function to each element of the outer list, which is passed in as the elem argument to the little anonymous function I've written. Then you just get the 2nd element of each of those: elem[[2]], and index into it to get the [2, 3] value.
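The same idea extends to the whole matrix of means without any extra package. This is only a sketch, assuming every inner list holds a matrix of the same dimensions in position 2; set.seed is added purely so the example is reproducible.

```r
set.seed(1)  # reproducibility only; not part of the original question
x <- lapply(1:5, function(i) {
  list(c(1, 2, 3, 4, 5), matrix(runif(10), 2, 5), matrix(0, 2, 2))
})

# Pull out the second component of every inner list ...
second <- lapply(x, function(elem) elem[[2]])
# ... add the matrices element-wise and divide by how many there are
xbar <- Reduce(`+`, second) / length(second)

# Cross-check one cell against the sapply approach
xbar_23 <- mean(sapply(x, function(elem) elem[[2]][2, 3]))
```

xbar[i, j] is then the mean of the (i, j) entries across all trials, and xbar[2, 3] agrees with xbar_23.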
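Here is some code that may help: first extract the second element of each inner list, then apply on top of it.
ulist=function(x)
{
  # pre-allocate the result list (it was never created in the first version)
  lst=vector("list", length(x))
  for(i in seq_along(x)) lst[[i]]=x[[i]][[2]]
  return(lst)
}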
#apply mean of each column of the x[[i]][[2]] matrix
sapply(ulist(x),function(y) apply(y,2,mean))
#apply mean of each row of the x[[i]][[2]] matrix
sapply(ulist(x),function(y) apply(y,1,mean))

Concise application of a binary function to rows of a (m x 2) matrix

Given the following matrix with weights in lbs in the first column and heights in inches in the second column:
> wgt.hgt.matrix
     [,1] [,2]
[1,]  180   70
[2,]  156   67
[3,]  128   64
[4,]  118   66
[5,]  202   72
I am looking for a concise way to apply a binary function like
function(lb, inch) { (lb/inch**2)*703 } -> bmi
to each row of the matrix, resulting in an array, list or vector with the 5 resulting BMI values. One way I found uses the apply function:
apply(wgt.hgt.matrix, 1, function(row) bmi(row[1], row[2]))
But a splat operator as in Ruby (*) would help make the call more concise and clear:
apply(wgt.hgt.matrix, 1, function(row) bmi(*row))
Does an equivalent to the splat operator exist, i.e. a syntax element telling R to split all vector-like objects to populate argument lists? Are there other, simpler or more concise suggestions for the apply call?
Perhaps I'm missing something, but what's wrong with:
wgt.hgt.matrix <-
structure(c(180L,156L,128L,118L,202L,70L,67L,64L,66L,72L), .Dim=c(5L,2L))
bmi <- function(lb, inch) (lb/inch**2)*703
bmi(wgt.hgt.matrix[,1], wgt.hgt.matrix[,2])
Update:
Based on the OP's comment, it seems like do.call would work more generally:
# put each matrix column in a separate list element
lc <- lapply(1:ncol(wgt.hgt.matrix), function(i) wgt.hgt.matrix[,i])
# call 'bmi' with one argument for each column / list element
do.call(bmi, lc)
Using the bmi() function as a vectorized solution is preferable since it has all vectorized operators, as was illustrated in Joshua's answer. You can also do this with:
colnames(wgt.hgt.matrix) <- c("lb", "inch")
with( as.data.frame(wgt.hgt.matrix), bmi(lb,inch) )
# [1] 25.82449 24.43039 21.96875 19.04362 27.39313
Unfortunately, matrices are not a good substrate for constructing environments with 'with', so coercing to a data frame was needed above. You could get an apply solution (which will be less time-efficient than a vectorized approach) to work with a version of bmi() rewritten to take a vector with named elements (as created above):
bmi <- function(vec) { (vec['lb']/vec['inch']**2)*703 }
apply(wgt.hgt.matrix, 1, function(row) bmi(row ) )
# [1] 25.82449 24.43039 21.96875 19.04362 27.39313
We can get pretty close to the syntax you're looking for with do.call:
## Setup
wgt.hgt.matrix=matrix(c(180,70,156,67,128,64,118,66,202,72),ncol=2,byrow=TRUE)
bmi = function(lb, inch) { (lb/inch**2)*703 }
## The action
apply(wgt.hgt.matrix, 1, function(row) do.call(bmi,as.list(row)))
do.call() is actually more flexible than just a splat operator, in that you can use the list names to give the argument names.
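A small sketch of that name matching, with the columns deliberately stored in the "wrong" order (this column layout is an assumption made for illustration): as.list on a named row produces a named list, so do.call matches lb and inch by name rather than by position.

```r
bmi <- function(lb, inch) (lb / inch^2) * 703

# Columns are in reverse order relative to bmi's arguments, but they are named
m <- matrix(c(70, 180,
              67, 156),
            ncol = 2, byrow = TRUE,
            dimnames = list(NULL, c("inch", "lb")))

# as.list(row) keeps the names, so do.call passes lb = ..., inch = ...
res <- apply(m, 1, function(row) do.call(bmi, as.list(row)))

# Reference values from the vectorized call
expected <- bmi(m[, "lb"], m[, "inch"])
```

A positional splat would have swapped weight and height here; the named list avoids that.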

R - Apply function on every row, with data from the row

Is there a faster way to do this:
dataframe$results <- NA_real_  # create the column before filling it row by row
for (i in 1:nrow(dataframe))
{
  dataframe$results[i] <- with(dataframe, myownfunction(column1[i],
    column2[i], column3[i], column4[i], column5[i], column6[i]))
}
myownfunction finds implied volatility using uniroot(), but when uniroot does not find a solution it stops (usually because there are some faults in the data), and the loop stops. Is there a way to make the function just output NA if it gets an error from uniroot and continue with the next row?
regards.
Part1: It's very likely that you could succeed with:
ave( dataframe[, c("column1", "column2", "column3", "column4", "column5", "column6")], FUN = myownfunction)
Part2: If you modified the function to test for failure with try and returned NA when it fails you can fill in the missing data in the result properly.
You seem to have two questions:
1) Returning a value if failure happens. As posted already, look at failwith in the plyr package. It does exactly this.
2) Speeding up the for loop. Have you tried using mapply? It is a multivariate apply that applies a function to the corresponding elements of each argument simultaneously. So
mapply(myfunc, column1, column2, column3, column4)
(along with modifying myfunc to use failwith) would do the sort of thing you are looking for.
The plyr version of mapply is mdply if you prefer.
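For readers without plyr, a base R sketch of the same two ideas, using tryCatch in place of failwith. Everything here is a stand-in: solve_row is a hypothetical replacement for myownfunction, and the columns target and scale are made up. The second row is chosen so that uniroot fails because the function has the same sign at both ends of the interval.

```r
# Hypothetical stand-in for myownfunction: solve scale * v^2 = target on [0, 5]
solve_row <- function(target, scale) {
  uniroot(function(v) scale * v^2 - target, lower = 0, upper = 5)$root
}

# Wrapper that turns any error from uniroot into NA instead of stopping
safe_solve <- function(...) {
  tryCatch(solve_row(...), error = function(e) NA_real_)
}

df <- data.frame(target = c(4, -1, 9), scale = c(1, 1, 1))

# mapply feeds the i-th element of each column to the function, row by row
df$results <- mapply(safe_solve, df$target, df$scale)
```

Rows where a root exists get a numeric result, and the failing row gets NA while the remaining rows are still processed.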
Got this from another forum, I like it more than the previous options:
R> M <- matrix(1:6, nrow=3, byrow=TRUE)
R> M
     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6
R> apply(M, 1, function(x) 2*x[1]+x[2])
[1] 4 10 16
