R: how to use symbols as a FUN for apply? - r

In R's documentation for apply, it says:
FUN: the function to be applied: see ‘Details’. In the case of functions like +, %*%, etc., the function name must be backquoted or quoted.
I don't understand the latter half of the sentence.
When I do
matrix1 = matrix(rnorm(3*4), 3, 4)
apply(matrix1, 1, "+")
I get the transpose of the matrix
And when I do
apply(matrix1, 1, "%*%")
I get an error.
I'm trying to get the row-wise sum and product of this matrix.
Also, if that's not what the documentation is talking about, what are + and %*% supposed to do when supplied as the FUN argument of apply?

matrix1 = matrix(rnorm(3*4), 3, 4)
apply(matrix1, 1, "+")
This does something like a transpose because apply passes the rows of the matrix1 object to the function one by one and returns the values of each call as columns. If, on the other hand, you had specified:
apply(matrix1, 2, "+")
There would not have been the appearance of transposition because apply always returns its values as a column-major result.
In the second instance, you didn't give a second argument to the "%*%" operator. The "+" operator can be either unary or binary, but the "%*%" operator is always binary. (It doesn't really make a lot of sense to use "%*%" with apply over a single dimension, since "%*%" is really designed as a standalone operator.) If you want the row-wise sum then just use:
rowSums(matrix1)
But you could have used the slower:
apply(matrix1, 1, sum)
For product use:
apply(matrix1, 1, prod)
Neither + nor %*% is designed to collapse its arguments into a single value, in contrast to sum and prod, which are so designed.
Reply to comment: The %*% operator performs the "matrix multiply" operation. The elements of row i of the first argument are multiplied by the elements of column j of the second argument and summed to deliver the (i, j) element of a new matrix. Many mathematical operations with statistical or physical meaning which would otherwise require a double for-loop can be accomplished by matrix multiplication. Let's imagine your matrix was a bunch of data values and you wanted to come up with a prediction for each column based on a model with three coefficients equal to, say, c(5,6,7):
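To make the difference concrete, here is a minimal sketch. The matrix m and its integer values are made up for illustration (the question used random numbers), chosen so the results are easy to check by hand.

```r
# Hypothetical 3 x 4 matrix with rows (1,4,7,10), (2,5,8,11), (3,6,9,12)
m <- matrix(1:12, nrow = 3, ncol = 4)

# Unary "+" hands each row back unchanged; apply() stores each result
# as a column, so the output looks like the transpose of m
plus_rows <- apply(m, 1, "+")

# sum and prod collapse each row to a single value
row_sums  <- apply(m, 1, sum)
row_prods <- apply(m, 1, prod)
```

Here plus_rows has the same entries as t(m), and row_sums agrees with rowSums(m).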
c(5,6,7) %*% matrix1
# [,1] [,2] [,3] [,4]
#[1,] 2.047344 10.02339 1.73618 0.7223964
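As a rough check of what that product computes, the sketch below uses a made-up integer matrix m in place of the random matrix1, and the name coefs is only illustrative. A length-3 vector on the left of a 3 x 4 matrix is treated as a row vector, so the result holds one weighted sum per column.

```r
# Hypothetical 3 x 4 data matrix and the three coefficients from the text
m <- matrix(1:12, nrow = 3, ncol = 4)
coefs <- c(5, 6, 7)

# Matrix multiply: a 1 x 4 matrix, one value per column of m
pred_mm <- coefs %*% m

# Same numbers by hand: coefs is recycled down each column,
# then each column is summed
pred_manual <- colSums(coefs * m)
```

The first entry is 5*1 + 6*2 + 7*3 = 38, and both approaches give the same four values.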

Related

What are the rules for threading a function over a vector in R?

I have some code which I call with two vectors of different length, let's call them A and B. However, I wrote the function having in mind a single element of A with the expectation that it will be automatically threaded over A. To be concrete,
A <- rnorm(5)
B <- rnorm(30)
foo <- function(x,B){
sum( cos(x*B) ) # calculate sum_i cos(x*B[i])
}
sum( exp(foo(A,B)) ) # expecting this to calculate the exponent for each A[j] and add over j
I need to get
Σ_j exp( Σ_i cos(A[j]*B[i]) )
and not
Σ_ij exp(cos(A[j]*B[i])) OR exp(cos(Σ_ij A[j]*B[i]))
I suspect that the last R expression is ambiguous, since the declaration of foo does not know B is always a vector. What are the formal rules and am I right to worry about the ambiguity?
If you want to loop over A, use sapply to apply foo to each element of A through an anonymous function, then exponentiate the output vector and take its sum:
sum(exp(sapply(A, function(x) foo(x, B))))
In the OP's example, the expression foo(A, B) computes the product A*B first, and since the lengths of A and B are unequal, the recycling rule applies. No warning is issued only because, by pure luck, the length of one vector is a multiple of the length of the other.
You can also Vectorize the x input. I think this is what you were expecting. At the end of the day, this works its way down to an mapply() call, which is a multivariate sapply, so it is probably best to just do it yourself as in the solution from akrun.
foo2 <- Vectorize(foo, "x")
sum(exp(foo2(A, B)))
The "formal rules", as you put it, are simply those described in help("Arithmetic"):
The binary operators return vectors containing the result of the element by element operations. If involving a zero-length vector the result has length zero. Otherwise, the elements of shorter vectors are recycled as necessary (with a warning when they are recycled only fractionally). The operators are + for addition, - for subtraction, * for multiplication, / for division and ^ for exponentiation.
So when you use x*B, it is doing element-wise multiplication. Nothing changes when you pass A into the function instead of x.
Simply go through your lines one at a time.
1. x*B will be a vector of length max(length(x), length(B)). When they are not of the same length, R will recycle the elements of the shorter vector (i.e., repeat them).
2. cos(x*B) will be a vector of the same length as in step (1), now holding the cosine of each value.
3. sum( cos(x*B) ) will sum that vector, returning a single number.
4. foo(A,B) does steps (1) through (3), but with your defined A and B. Note that in your example A is recycled 6 times to get to the length of B. In other words, what you entered as A is being used as rep(A, 6) in the multiplication step. Nothing about a function definition in R says that foo(A,B) should be repeated for each element of vector A. So it behaves literally as you wrote it, basically swapping in A for x in the function code.
5. exp(foo(A,B)) will take the result of foo from step (4) (which is a scalar) and exponentiate it.
6. sum( exp(foo(A,B)) ) does nothing: since the result of step (5) is a scalar, there is nothing to sum.
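The walk-through above can be checked directly. This is a sketch only: the seed is arbitrary and added for reproducibility, and the outer() variant at the end is an alternative that none of the answers proposed.

```r
set.seed(1)  # arbitrary seed, added only so the run is reproducible
A <- rnorm(5)
B <- rnorm(30)
foo <- function(x, B) sum(cos(x * B))

# foo(A, B) recycles A six times and returns a single number
recycled   <- foo(A, B)
same_thing <- sum(cos(rep(A, 6) * B))

# One value per element of A, which is what the question asked for
per_element <- sapply(A, function(x) foo(x, B))
wanted      <- sum(exp(per_element))

# Alternative without an explicit loop: outer(A, B) is the 5 x 30
# matrix of products A[j] * B[i], and rowSums() sums over i
wanted_outer <- sum(exp(rowSums(cos(outer(A, B)))))
```

recycled and same_thing agree, while wanted and wanted_outer both compute Σ_j exp( Σ_i cos(A[j]*B[i]) ).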

Why is mean not working in Reduce?

There are two examples of function Reduce() in Hadley Wickham's book Advanced R. Both work well.
Reduce(`+`, 1:3) # -> ((1 + 2) + 3)
Reduce(sum, 1:3) # -> sum(sum(1, 2), 3)
However, when using mean in Reduce(), it does not follow the same pattern. The outcome is always the first element of the list.
> Reduce(mean, 1:3)
[1] 1
> Reduce(mean, 4:2)
[1] 4
The two functions sum() and mean() are very similar. Why does one work fine with Reduce() while the other does not? How can I tell whether a function behaves normally in Reduce() before it gives an incorrect result?
This has to do with the fact that, unlike sum or +, mean expects a single argument (i.e., a vector of values), and as such cannot be applied in the manner in which Reduce operates, namely:
Reduce uses a binary function to successively combine the elements of
a given vector and a possibly given initial value.
Take note of the signature of mean:
mean(x, ...)
When you pass multiple values to it, the function will match x to the first value and ignore the rest. For example, when you call Reduce(mean, 1:3), this is more or less what is going on:
mean(1, 2)
#[1] 1
mean(mean(1, 2), 3)
#[1] 1
Compare this with the behavior of sum, which accepts a variable number of values:
sum(1, 2)
#[1] 3
sum(sum(1, 2), 3)
#[1] 6
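One way to see what a function does inside Reduce() before trusting the result is to inspect the intermediate values. The sketch below is illustrative only and the variable names are made up. It also shows that wrapping mean in a two-argument function runs without error but still does not produce the overall mean.

```r
x <- c(1, 2, 3)

# mean() only averages its first argument, so the fold returns x[1]
first_only <- Reduce(mean, x)

# accumulate = TRUE exposes every intermediate result of the fold
steps <- Reduce(sum, x, accumulate = TRUE)

# A two-argument wrapper computes mean(c(mean(c(1, 2)), 3)),
# a nested average rather than the overall mean
nested <- Reduce(function(a, b) mean(c(a, b)), x)

# The overall mean needs the whole vector at once
overall <- mean(x)
```

Here steps is the running total 1, 3, 6, while nested is 2.25 and overall is 2, so mean is not a good fit for a pairwise fold.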

Need help vectorizing a double for-loop creating a matrix of norms of vector differences in R

I'm trying to figure out how to vectorize the following code block in R:
X is an N x M matrix
centers is a K x M matrix
phi <- matrix(0, nrow(X), nrow(centers))
for(i in 1:nrow(phi)) {
for(j in 1:ncol(phi)) {
phi[i, j] <- norm(as.matrix(X[i, ]) - as.matrix(centers[j, ]), type = 'F')
}
}
I'm constructing an N x K matrix, phi, which at each position, [i, j], contains the norm of the difference between the vectors at row i of X and row j of centers:
phi[i, j] = || X[i, ] - centers[j, ] ||
My approach so far has been to attempt to use R's outer() function. I'm new to the outer() function, so I've searched for several examples; however, the examples I've come across involve using outer() to apply some function to a pair of vectors of scalar values. As I'm dealing with the differences between pairs of rows from two matrices, outer() behaves differently than expected. I'm not sure how to get it to recognize the matrices I'm passing it (X and centers) as vectors of vectors, where each row represents a vector to be involved in the computation of phi.
In addition, when I define a function to compute the norm of the difference between two M-length vectors, that function returns a scalar. It is my understanding that in order to vectorize a function using R's Vectorize(), that function must return a result of the same length as its arguments. I'm not sure how to define a function which, when used in conjunction with outer(), recognizes each row of a matrix as a single element (in spite of it being an M-length vector).
Below are a couple of my attempts to use outer() with toy examples of the matrices X and centers.
X <- matrix(c(7,8,9,1,2,3,4,5,6), 3, 3)
centers <- matrix(c(1,2,3,4,5,6), 2, 3)
fun <- function(y, x) norm(as.matrix(y) - as.matrix(x), type = 'F')
outer(X, centers, fun)
This was my first attempt. I was trying to use outer() in a manner analogous to the way it is used when it is passed a pair of vectors. I was (naively) hoping it would take one row from each matrix at a time, pass them as the two arguments to fun, and position the result appropriately in the product matrix. Instead, I get the following error message.
Error in outer(X, centers, fun) :
dims [product 54] do not match the length of object [1]
I also tried vectorizing my function using R's Vectorize() before calling outer().
Vecfun <- Vectorize(fun)
outer(X, centers, Vecfun)
In this case, I no longer get an error message, but the result is an erroneous matrix of matrices. I'm also new to the Vectorize() function, so I'm not too sure why it produces the result that it does as I don't have a real grasp on what it does; using it was sort of a shot in the dark.
I'll appreciate any help in vectorizing my original problem; I'm completely open to suggestions that do not involve outer().
Clarifications regarding outer() and Vectorize() also welcome.
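No answer is recorded here, so the following is only a sketch of two possible approaches, using the toy matrices from the question. The helper row_dist and the result names are hypothetical. The first option has outer() iterate over row indices rather than matrix cells; the second relies on the identity ||x - c||^2 = ||x||^2 + ||c||^2 - 2 x·c.

```r
X <- matrix(c(7, 8, 9, 1, 2, 3, 4, 5, 6), 3, 3)
centers <- matrix(c(1, 2, 3, 4, 5, 6), 2, 3)

# Option 1: pass row indices to outer(). row_dist is a hypothetical helper;
# Vectorize() wraps it in mapply() so it accepts whole index vectors and
# returns one distance per (i, j) pair.
row_dist <- function(i, j) sqrt(sum((X[i, ] - centers[j, ])^2))
phi_outer <- outer(seq_len(nrow(X)), seq_len(nrow(centers)),
                   Vectorize(row_dist))

# Option 2: matrix algebra on the squared distances
sq <- outer(rowSums(X^2), rowSums(centers^2), "+") - 2 * X %*% t(centers)
phi_algebra <- sqrt(pmax(sq, 0))  # pmax guards against tiny negative round-off
```

Both give an N x K matrix. For example, row 1 of X is (7, 1, 4) and row 1 of centers is (1, 3, 5), so the first entry is sqrt(36 + 4 + 1) = sqrt(41). Option 1 still loops internally, so Option 2 is likely the faster choice on large inputs.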

vectorize a bidimensional function in R

I have some true and predicted labels
truth <- factor(c("+","+","-","+","+","-","-","-","-","-"))
pred <- factor(c("+","+","-","-","+","+","-","-","+","-"))
and I would like to build the confusion matrix.
I have a function that works on single elements
f <- function(x,y){ sum(y==pred[truth == x])}
however, when I apply it to the outer product, to build the matrix, R seems unhappy.
outer(levels(truth), levels(truth), f)
Error in outer(levels(x), levels(x), f) :
dims [product 4] do not match the length of object [1]
What is the recommended strategy for this in R ?
I can always go through higher order stuff, but that seems clumsy.
I sometimes fail to understand where outer goes wrong, too. For this task I would have used the table function:
> table(truth,pred) # arguably a lot less clumsy than your effort.
pred
truth - +
- 4 2
+ 1 3
In this case, you are testing whether a multivalued vector is "==" to a scalar.
outer assumes that the function passed to FUN can take vector arguments and work properly with them. If m and n are the lengths of the two vectors passed to outer, it will first create two vectors of length m*n such that every combination of inputs occurs, and pass these as the two new vectors to FUN. In return, outer expects FUN to give back another vector of length m*n.
The function described in your example doesn't really do this. In fact, it doesn't handle vectors correctly at all.
One way is to define another function that can handle vector inputs properly; alternatively, if your program only requires simple matching, you could use table() as in @DWin's answer.
If you're redefining your function, outer is expecting a function that will be run for inputs:
f(c("+","+","-","-"), c("+","-","+","-"))
and per your example, ought to return,
c(3,1,2,4)
There is also the small matter of decoding the actual meaning of the error:
Again, if m and n are the lengths of the two vectors passed to outer, it will first create a vector of length m*n, and then reshape it using (basically)
dim(output) = c(m,n)
This is the line that gives an error, because outer is trying to shape the output into a 2x2 matrix (total 2*2 = 4 items) while the function f, assuming no vectorization, has given only 1 output. Hence,
Error in outer(levels(x), levels(x), f) :
dims [product 4] do not match the length of object [1]
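If you would rather keep f as written, one option (a sketch, not taken from the answers above) is to wrap it with Vectorize() so that it returns one count per pair of inputs, which is the shape outer expects. The names lv and conf_mat are made up for illustration.

```r
truth <- factor(c("+","+","-","+","+","-","-","-","-","-"))
pred  <- factor(c("+","+","-","-","+","+","-","-","+","-"))
f <- function(x, y) sum(y == pred[truth == x])

lv <- levels(truth)

# Vectorize() wraps f in mapply(), so the wrapped function returns a
# vector of length m*n that outer() can reshape into an m x n matrix
conf_mat <- outer(lv, lv, Vectorize(f))
dimnames(conf_mat) <- list(truth = lv, pred = lv)
```

The counts in conf_mat match table(truth, pred), which remains the simpler choice for this particular task.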

How do functions that simultaneously operate over vectors and their elements work in R?

Take the following example:
boltzmann <- function(x, t=0.1) { exp(x/t) / sum(exp(x/t)) }
z=rnorm(10,mean=1,sd=0.5)
t=0.1 # same value as the default t in boltzmann
exp(z[1]/t)/sum(exp(z/t))
[1] 0.0006599707
boltzmann(z)[1]
[1] 0.0006599707
It appears that exp in the boltzmann function operates over elements and vectors and knows when to do the right thing. Is the sum "unrolling" the input vector and applying the expression on the values? Can someone explain how this works in R?
Edit: Thank you for all of the comments, clarification, and patience with an R n00b. In summary, the reason this works was not immediately obvious to me coming from other languages. Take Python, for example. You would first compute the sum and then compute the value for each element in the vector.
denom = sum([exp(v / t) for v in x])
vals = [exp(v / t) / denom for v in x]
Whereas in R the sum(exp(x/t)) can be computed inline.
This is explained in An Introduction to R, Section 2.2: Vector arithmetic.
Vectors can be used in arithmetic expressions, in which case the
operations are performed element by element. Vectors occurring in the
same expression need not all be of the same length. If they are not,
the value of the expression is a vector with the same length as the
longest vector which occurs in the expression. Shorter vectors in the
expression are recycled as often as need be (perhaps fractionally)
until they match the length of the longest vector. In particular a
constant is simply repeated. So with the above assignments the command
x <- c(10.4, 5.6, 3.1, 6.4, 21.7)
y <- c(x, 0, x)
v <- 2*x + y + 1
generates a new vector v of length 11 constructed by adding together,
element by element, 2*x repeated 2.2 times, y repeated just once, and
1 repeated 11 times.
This might be clearer if you evaluated the numerator and the denominator separately:
x = rnorm(10,mean=1,sd=0.5)
t = .1
exp(x/t)
# [1] 1.845179e+05 6.679273e+03 4.379369e+06 1.852623e+06 9.960374e+02
# [6] 1.359676e+09 6.154045e+03 1.777027e+01 1.070003e+04 6.217397e+04
sum(exp(x/t))
# [1] 2984044296
Since the numerator is a vector of length 10, and the denominator is a vector of length 1, the division returns a vector of length 10.
Since you're interested in comparing this to Python, imagine the two following rules were added to Python (incidentally, these are similar to the usage of arrays in numpy):
If you divide a list by a number, it will divide all items in the list by the number:
[2, 4, 6, 8] / 2
# [1, 2, 3, 4]
The function exp in Python is "vectorized", which means that when it is applied to a list it will apply to each item in the list. However, sum still works the way you expect it to.
exp([1, 2, 3]) => [exp(1), exp(2), exp(3)]
In that case, imagine how this code would be evaluated in Python:
t = .1
x = [1, 2, 3, 4]
exp(x/t) / sum(exp(x/t))
It would follow the following simplifications, using those two simple rules:
exp([v / t for v in x]) / sum(exp([v / t for v in x]))
[exp(v / t) for v in x] / sum([exp(v / t) for v in x])
Now do you see how it knows the difference?
Vectorisation has several slightly different meanings in R.
It can mean accepting a vector input, transforming each element, and returning a vector (like exp does).
It can also mean accepting a vector input and calculating some summary statistic, then returning a scalar value (like mean does).
sum conforms to the second behaviour, but also has a third vectorisation behaviour, where it will create a summary statistic across inputs. Try sum(1, 2:3, 4:6), for example.
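The three behaviours can be seen side by side in a short sketch. The input values are made up for illustration, and boltzmann is the function from the question.

```r
# 1. Element-wise: the output has the same length as the input
elementwise <- exp(c(0, 1, 2))

# 2. Summary: a whole vector collapses to a single number
summary_val <- mean(c(1, 2, 3))

# 3. Summary across several arguments: all values are pooled, then summed
across_args <- sum(1, 2:3, 4:6)

# boltzmann() mixes 1 and 2: a length-n numerator divided by a length-1
# denominator, with the denominator recycled across the numerator
boltzmann <- function(x, t = 0.1) exp(x / t) / sum(exp(x / t))
p <- boltzmann(c(0.1, 0.2, 0.3))
```

Here across_args is 1 + 2 + 3 + 4 + 5 + 6 = 21, and p has three elements that sum to 1.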
