I would like to get a feel for functional programming in R.
To that end, I would like to write the Vandermonde matrix computation, as it can involve a few constructs.
In imperative style, that would be:
vandermonde.direct <- function (alpha, n)
{
if (!is.vector(alpha)) stop("argument alpha is not a vector")
if (!is.numeric(alpha)) stop("argument alpha is not a numeric vector")
m <- length(alpha)
V <- matrix(0, nrow = m, ncol = n)
V[, 1] <- rep(1, m)
j <- 2
while (j <= n) {
V[, j] <- alpha^(j - 1)
j <- j + 1
}
return(V)
}
How would you write that elegantly in R in a functional style?
The following does not work:
x10 <- runif(10)
n <- 3
Reduce(cbind, aaply(seq_len(n-1),1, function (i) { function (x) {x**i}}), matrix(1,length(x10),1))
It fails with Error: Results must have one or more dimensions. The aaply call is meant to build a list of functions, mapping each i in seq_len(n-1) to the function x -> x**i.
It does not seem very natural to use Reduce for this task.
The error message is caused by aaply, which tries to return an array: you can use alply instead. You also need to call your functions somewhere.
Here are a few idiomatic alternatives:
outer( x10, 0:(n-1), `^` )
t(sapply( x10, function(u) u^(0:(n-1)) ))
sapply( 0:(n-1), function(k) x10^k )
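For comparison, the Reduce attempt from the question can be made to work once each generated function is actually called. The following is a minimal sketch in base R, assuming lapply in place of plyr's alply and a small fixed x10 instead of runif(10); the names powers and cols are illustrative only.

```r
x10 <- c(0.5, 2, 3)   # fixed input (assumption) so the result is reproducible
n <- 3

# Build the list of closures x -> x^i for i = 1, ..., n-1
powers <- lapply(seq_len(n - 1), function(i) function(x) x^i)

# Call each closure on x10; this is the step missing from the original attempt
cols <- lapply(powers, function(f) f(x10))

# Bind the columns one by one onto an initial column of ones
V <- unname(Reduce(cbind, cols, matrix(1, length(x10), 1)))
V
```

The result has n columns holding the powers 0 to n-1 of x10, the same layout that vandermonde.direct(x10, n) produces.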
Here it is with Reduce:
m <- as.data.frame(Reduce(f=function(left, right) left * x10,
x=1:(n-1), init=rep(1,length(x10)), accumulate=TRUE))
names(m) <- 1:n - 1
Here's another option, that uses the environment features of R:
vdm <- function(a)
{
function(i, j) a[i]^(j-1)
}
This will work for arbitrary n (the number of columns).
To create the "Vandermonde functional" for a given a, use this:
v <- vdm(a=c(10,100))
To build a matrix all at once, use this:
> outer(1:3, 1:4, v)
[,1] [,2] [,3] [,4]
[1,] 1 10 100 1e+03
[2,] 1 100 10000 1e+06
[3,] 1 NA NA NA
Note that index a[3] is out of bounds, thus returning NA (except for the first column, which is 1).
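To avoid the NA row, the row indices can be generated from a itself. A small sketch, assuming the same vdm closure as above and using seq_along(a) so that i never runs past the end of a:

```r
# Closure factory: returns a function of (i, j) that remembers a
vdm <- function(a) {
  function(i, j) a[i]^(j - 1)
}

a <- c(10, 100)
v <- vdm(a)

# One row per element of a, four columns (powers 0 to 3)
V <- outer(seq_along(a), 1:4, v)
V
```

This works because the returned function is vectorized over i and j: indexing with a[i] and the ^ operator both operate elementwise on the expanded index vectors that outer passes in.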
I am trying to create a function for a theoretical Hessian matrix that I can then evaluate at different locations. First I tried storing expressions as entries of a matrix or array, but although I could initially put an expression into a matrix, I couldn't replace it with the calculated value.
hessian_matrix <- function(gx, respect_to){
out_mat <- matrix(0, nrow=length(respect_to), ncol=length(respect_to))
for(i in 1:length(respect_to)){
for(j in 1:length(respect_to)){
dthetad2x <- deriv(D(gx, respect_to[i]), respect_to[j], function.arg=TRUE)
# also tried
# dthetad2x <- as.expression(D(D(gx, respect_to[i])))
out_mat[i,j] <- dthetad2x
}
}
return(out_mat)
}
Because that didn't work, I decided to create an environment to hold the entries of the Hessian matrix as objects.
hessian_matrix <- function(gx, respect_to){
out_env <- new.env()
for(i in 1:length(respect_to)){
for(j in 1:length(respect_to)){
dthetad2x <- as.call(D(D(gx, respect_to[i]), respect_to[j]))
assign(paste0(i,j), dthetad2x, out_env)
}
}
return(out_env)
}
g <- expression(x^3-2*x*y-y^6)
h_g <- hessian_matrix(g, respect_to = c('x', 'y'))
This worked, and when I pass the environment as a parameter I can see the expressions, but I can't evaluate them. I tried call(), eval(), do.call(), get(), etc., and none of them worked. I also tried assigning the answer within the environment that was passed in, creating a new environment to return, and simply using variables.
fisher_observed <- function(h, at_val, params, sum=TRUE){
out_env <- new.env()
# add params to passed environment
for(i in 1:length(at_val)){
h[[names(at_val)[i]]] <- unname(at_val[i])
}
for(i in ls(h)){
value <- do.call(i, envir=h, at_val)
assign(i, value, out_env)
}
return(h)
}
fisher_observed(h_g, at_val=list(x=1,y=2))
According to the documentation for do.call(), this is how it should be used, but it isn't working when the environment is passed as a parameter in this way.
R can already compute the Hessian matrix, so you do not have to write this yourself. You could use deriv or deriv3 as shown below:
g <- expression(x^3 - 2 * x * y - y^6)
eval(deriv3(g, c('x','y')),list(x=1,y=2))
[1] -67
attr(,"gradient")
x y
[1,] -1 -194
attr(,"hessian")
, , x
x y
[1,] 6 -2
, , y
x y
[1,] -2 -480
If you want to use a function, you could do:
hessian <- function(expr,values){
nms <- names(values)
f <- eval(deriv3(expr, nms),as.list(values))
matrix(attr(f, 'hessian'), length(values), dimnames = list(nms,nms))
}
hessian(g, c(x=1,y=2))
x y
x 6 -2
y -2 -480
That said, the function is not really necessary: if you wanted both the gradient and the Hessian, calling it in addition to deriv3 would do the computation twice.
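If both the gradient and the Hessian are needed, a single deriv3 evaluation can supply them together. This is a sketch only; grad_hess is a hypothetical helper name, not an existing R function.

```r
# Hypothetical helper: evaluate deriv3() once and unpack value, gradient, Hessian
grad_hess <- function(expr, values) {
  nms <- names(values)
  f <- eval(deriv3(expr, nms), as.list(values))
  list(
    value    = as.numeric(f),
    gradient = setNames(as.numeric(attr(f, "gradient")), nms),
    hessian  = matrix(attr(f, "hessian"), length(nms),
                      dimnames = list(nms, nms))
  )
}

g <- expression(x^3 - 2 * x * y - y^6)
res <- grad_hess(g, c(x = 1, y = 2))
res$gradient
res$hessian
```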
I think this (almost) does what you're looking for:
fisher_observed <- function(h, at_val) {
values <- numeric(length = length(names(h)))
for (i in seq_len(length(names(h)))) {
values[i] = purrr::pmap(.l = at_val, function(x, y) eval(h[[names(h)[i]]]))
}
names(values) = names(h)
return(values)
}
This currently returns a named list of evaluated points:
$`21`
[1] -2
$`22`
[1] -480
$`11`
[1] 6
$`12`
[1] -2
You'd still need to re-arrange this into a matrix (which should be fairly easy, given that the names are preserved). I think the key thing is that the names must be characters when looking up values in h_g.
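One way to do that re-arrangement, sketched under the assumption that the second derivatives are stored in an environment under keys such as "11" and "12", as in the question. That naming scheme only stays unambiguous for fewer than ten variables.

```r
g <- expression(x^3 - 2 * x * y - y^6)
vars <- c("x", "y")

# Store each second derivative under the key "<i><j>", as in the question
h <- new.env()
for (i in seq_along(vars)) {
  for (j in seq_along(vars)) {
    assign(paste0(i, j), D(D(g, vars[i]), vars[j]), envir = h)
  }
}

# Evaluate every stored derivative at the given point and fill a matrix
at_val <- list(x = 1, y = 2)
H <- matrix(NA_real_, length(vars), length(vars), dimnames = list(vars, vars))
for (i in seq_along(vars)) {
  for (j in seq_along(vars)) {
    H[i, j] <- eval(get(paste0(i, j), envir = h), at_val)
  }
}
H
```

Passing the list at_val as the second argument of eval is what supplies the values of x and y, so nothing has to be assigned into the environment that holds the derivatives.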
You cannot have a matrix of "calls" but you can have a character matrix then evaluate it:
hessian_matrix <- function(gx, respect_to){
out_mat <- matrix("", nrow=length(respect_to), ncol=length(respect_to))
for(i in 1:length(respect_to)){
for(j in 1:length(respect_to)){
dthetad2x <- D(D(gx, respect_to[i]), respect_to[j])
out_mat[i,j] <- deparse(dthetad2x)
}
}
return(out_mat)
}
g <- expression(x^3-2*x*y-y^6)
h_g <- hessian_matrix(g, respect_to = c('x', 'y'))
h_g
#> [,1] [,2]
#> [1,] "3 * (2 * x)" "-2"
#> [2,] "-2" "-(6 * (5 * y^4))"
apply(h_g, 1:2, \(x) eval(str2lang(x), list(x=1, y=2)))
#> [,1] [,2]
#> [1,] 6 -2
#> [2,] -2 -480
I am trying to code a function which will identify which row of an nxm matrix M is closest to a vector y of length m.
What am I doing wrong in my code, please? I am aiming for the function to produce a column vector of length n that gives the distance between each row of the matrix and the vector y. I then want to output the number of the row of the matrix that is closest to the vector.
closest.point <- function(M, y) {
p <- length(y)
k <- nrow(M)
T <- matrix(nrow=k)
T <- for(i in 1:n)
for(j in 1:m) {
(X[i,j] - x[j])^2 + (X[i,j] - x[j])^2
}
W <- rowSums(T)
max(W)
df[which.max(W),]
}
Even though there is already a better approach to the problem (not using for loops when dealing with matrices), I would like to give you a solution that keeps your for-loop approach.
There were some mistakes in your function. Some variables, such as n, m and X, are undefined.
Also try to avoid naming a variable T, because R interprets T as TRUE. It works, but it could cause errors if later code uses T to mean TRUE.
When looping, you need to index the variable you are updating, as in T.matrix[i, j], and not assign to T.matrix as a whole, which would overwrite it at every iteration.
closest.point <- function(M, y) {
k <- nrow(M)
m <- ncol(M)
T.matrix <- matrix(nrow = k, ncol = m)
for (i in 1:k) {
for (j in 1:m) {
T.matrix[i, j] <- (M[i,j] - y[j])^2
}
}
W <- rowSums(T.matrix)
return(which.min(W))
}
# example 1
closest.point(M = rbind(c(1, 1, 1),
c(1, 2, 5)),
y = cbind(c(1, 2, 5)))
# [1] 2
# example 2
closest.point(M = rbind(c(1, 1, 1, 1),
c(1, 2, 5, 7)),
y = cbind(c(2, 2, 6, 2)))
# [1] 2
You should try to avoid using for loops to do operations on vectors and matrices. The base function dist calculates distances. Then which.min will give you the index of the minimal distance.
set.seed(0)
M <- matrix(rnorm(100), ncol = 5)
y <- rnorm(5)
closest_point <- function(M, y) {
dist_mat <- as.matrix(dist(rbind(M, y)))
all_distances <- dist_mat[1:nrow(M),ncol(dist_mat)]
which.min(all_distances)
}
closest_point(M, y)
#>
#> 14
Created on 2021-12-10 by the reprex package (v2.0.1)
Hope this makes sense, let me know if you have questions.
There are a number of problems here:
p is defined but never used.
Although not wrong, T does not really have to be a matrix. A vector would be sufficient.
Although not wrong, using T as a variable name is dangerous because T also means TRUE.
The code defines T and then immediately throws it away by overwriting it in the next statement. The first definition of T is never used.
for always has the value NULL, so assigning it to T is pointless.
The double for loop doesn't do anything. There are no assignments in it, so the loops have no effect.
The loops refer to m, n, X and x, but these are not defined anywhere.
(X[i,j] - x[j])^2 is repeated. It is only needed once.
Writing max(W) on a line by itself has no effect. It only causes printing if done directly in the console. Inside a function it has no effect. If you meant to print it, write print(max(W)).
We want the closest point, not the farthest point, so max should be min.
df is used in the last line but is not defined anywhere.
The question is incomplete without a test run.
I have tried to make the minimum changes to make this work:
closest.point <- function(M, y) {
nr <- nrow(M)
nc <- ncol(M)
W <- numeric(nr) # vector having nr zeros
for(i in 1:nr) {
for(j in 1:nc) {
W[i] <- W[i] + (M[i,j] - y[j])^2
}
}
print(W)
print(min(W))
M[which.min(W),]
}
set.seed(123)
M <- matrix(rnorm(12), 4); M
## [,1] [,2] [,3]
## [1,] -0.56047565 0.1292877 -0.6868529
## [2,] -0.23017749 1.7150650 -0.4456620
## [3,] 1.55870831 0.4609162 1.2240818
## [4,] 0.07050839 -1.2650612 0.3598138
y <- rnorm(3); y
## [1] 0.4007715 0.1106827 -0.5558411
closest.point(M, y)
## [1] 0.9415062 2.9842785 4.6316069 2.8401691 <--- W
## [1] 0.9415062 <--- min(W)
## [1] -0.5604756 0.1292877 -0.6868529 <-- closest row
That said, the closest row can be found by a function with a one-line body. We transpose M and then subtract y from it. This subtracts y from each column of the transpose, and since the columns of the transpose are the rows of M, it subtracts y from each row of M. Then take the column sums of the squared differences, find which one is smallest, and subscript M with that index.
closest.point2 <- function(M, y) {
M[which.min(colSums((t(M) - y)^2)), ]
}
closest.point2(M, y)
## [1] -0.5604756 0.1292877 -0.6868529 <-- closest row
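The recycling step in t(M) - y is easy to get wrong, so here is a small check of it against an explicit sweep over the columns. This is a sketch with made-up data; d_recycle and d_sweep are illustrative names.

```r
M <- rbind(c(1, 1, 1),
           c(1, 2, 5),
           c(0, 2, 4))
y <- c(1, 2, 4.5)

# t(M) has one column per row of M; y is recycled down each of those columns
d_recycle <- colSums((t(M) - y)^2)

# Same squared distances, subtracting y[j] from column j of M explicitly
d_sweep <- rowSums(sweep(M, 2, y)^2)

closest <- which.min(d_recycle)   # index of the closest row
M[closest, ]
```

Both vectors hold the squared Euclidean distance from y to each row of M, so which.min picks the same row either way.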
I'm given a question in R language to find the 30th term of the recurrence relation x(n) = 2*x(n-1) - x(n-2), where x(1) = 0 and x(2) = 1. I know the answer is 29 from mathematical deduction. But as a newbie to R, I'm slightly confused by how to make things work here. The following is my code:
loop <- function(n){
a <- 0
b <- 1
for (i in 1:30){
a <- b
b <- 2*b - a
}
return(a)
}
loop(30)
It returns 1, which is way off.
In case you're wondering why this looks Python-ish, I've mostly only been exposed to Python programming thus far (I'm new to programming in general). I've tried to check out all the syntax in R, but I suppose my logic is quite fixed by Python. Can someone help me out in this case? In addition, does R have any resources like PythonTutor to help visualise the code execution logic?
Thank you!
I guess what you need is something like the following:
loop <- function(n){
if (n<=2) return(n-1)
a <- 0
b <- 1
for (i in 3:n){
a_new <- b
b <- 2*b - a
a <- a_new
}
return(b)
}
then
> loop(30)
[1] 29
If you need a recursive version, here is one implementation:
loop <- function(n) {
if (n<=2) return(n-1)
2*loop(n-1)-loop(n-2)
}
which also gives
> loop(30)
[1] 29
You can solve it in a couple of other ways.
Solve the linear homogeneous recurrence relation: let
x(n) = r^n
Plugging this into the recurrence relation gives
r^n - 2*r^(n-1) + r^(n-2) = 0
i.e., the quadratic
r^2 - 2*r + 1 = 0
which has the double root
r = 1
leading to the general solution
x(n) = c1 * 1^n + c2 * n * 1^n = c1 + n * c2
With x(1) = 0 and x(2) = 1, you get c2 = 1, c1 = -1, so that
x(n) = n - 1
=> x(30) = 29
Hence, R code to compute x(n) as a function of n is trivial, as shown below:
x <- function(n) {
return (n-1)
}
x(30)
#29
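As a sanity check, the closed form can be compared with a direct iteration of the recurrence. The sketch below uses a hypothetical helper x_iter that fills in the sequence term by term.

```r
# Hypothetical helper: compute x(n) by iterating x(k) = 2*x(k-1) - x(k-2)
x_iter <- function(n) {
  x <- numeric(max(n, 2))
  x[1] <- 0
  x[2] <- 1
  for (k in seq_len(n)[-(1:2)]) {   # k = 3, ..., n (empty when n <= 2)
    x[k] <- 2 * x[k - 1] - x[k - 2]
  }
  x[n]
}

iterated    <- sapply(1:30, x_iter)
closed_form <- (1:30) - 1
all(iterated == closed_form)
```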
Use matrix powers (first write the recurrence in matrix form, [x(n+1) x(n)]T = A.[x(n) x(n-1)]T, which gives the matrix A defined in the code below):
(The eigenvalue 1 of A has algebraic multiplicity 2 but geometric multiplicity 1, so its matrix of eigenvectors is singular and A is not diagonalizable. Otherwise you could use the spectral decomposition yourself for fast computation of matrix powers. Here we use the expm package, as shown below.)
library(expm)
A <- matrix(c(2,1,-1,0), nrow=2)
A %^% 29 %*% c(1,0) # [x(31) x(30)]T = A^29.[x(2) x(1)]T
# [,1]
# [1,] 30 # x(31)
# [2,] 29 # x(30)
# compute x(n)
x <- function(n) {
(A %^% (n-1) %*% c(1,0))[2]
}
x(30)
# 29
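For this particular matrix, the powers can also be written down in base R without expm. The sketch below relies on the observation that N = A - I satisfies N %*% N = 0, so the binomial expansion of (I + N)^k collapses to I + k*N. The names N, A_pow and x_n are illustrative only.

```r
A <- matrix(c(2, 1, -1, 0), nrow = 2)
N <- A - diag(2)

# N is nilpotent: its square is the zero matrix
stopifnot(all(N %*% N == 0))

# Hence A^k = (I + N)^k = I + k * N
A_pow <- function(k) diag(2) + k * N

# x(n) is the second component of A^(n-1) %*% [x(2), x(1)]
x_n <- function(n) (A_pow(n - 1) %*% c(1, 0))[2]
x_n(30)
```

This shortcut is specific to matrices of the form I + N with N squared equal to zero; it is not a general replacement for the %^% operator.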
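In your loop, a is overwritten with b before the new b is computed, so 2*b - a always evaluates to b and nothing is updating. The loop bound is also hard-coded to 30 instead of using n. Keep the old value of b in a temporary variable:
loop <- function(n){
a <- 0
b <- 1
for (i in seq_len(n - 1)){
tmp <- b
b <- 2*b - a
a <- tmp
}
return(a)
}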
You could define a recursive function.
f <- function(x, n) {
n <- 1:n
r <- function(n) {
if (length(n) == 2) x[2]
else r({
x <<- c(x[2], 2*x[2] - x[1])
n[-1]
})
}
r(n)
}
x <- c(0, 1)
f(x, 30)
# [1] 29
In Wolfram Mathematica, there is a function NestList[f,x,n] that produces a vector of length n+1 by repeatedly applying the function f to x. See documentation.
Is there something similar in R?
Executing do.call would make the same computations multiple times.
Example (reaction to USER_1's suggestion):
foo <- function(x) {x+1}
map(0, foo)
# [[1]]
# [1] 1
Just write one. Such a function has to loop anyway (recursion is not advisable if n can get large).
NestList <- function(f, x, n) {
stopifnot(n > 0)
res <- rep(x, n + 1)
for (i in seq_len(n)) res[i+1] <- f(res[i])
res
}
NestList(function(x) x^2, 2, 5)
#[1] 2 4 16 256 65536 4294967296
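A more functional spelling of the same idea is possible with Reduce and accumulate = TRUE. This is a sketch, assuming f maps a length-one value to a length-one value; nest_list is a hypothetical name, and the loop still happens inside Reduce.

```r
# Hypothetical helper: x, f(x), f(f(x)), ... with n applications of f
nest_list <- function(f, x, n) {
  # The second argument of the reducer (the counter i) is ignored;
  # seq_len(n) only controls how many times f is applied
  Reduce(function(acc, i) f(acc), seq_len(n), x, accumulate = TRUE)
}

squares <- nest_list(function(x) x^2, 2, 5)
squares
```

Each value is computed once from the previous one, so nothing is recomputed. When every intermediate result has length one, Reduce simplifies the accumulated list to a plain vector.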
I understand how outer() works in R:
> outer(c(1,2,4),c(8,16,32), "*")
[,1] [,2] [,3]
[1,] 8 16 32
[2,] 16 32 64
[3,] 32 64 128
It basically takes 2 vectors, forms every pair of their elements (the Cartesian product), and then applies the function to each pair.
I don't have two vectors, however. I have two lists of matrices:
M = list();
M[[1]] = matrix(...)
M[[2]] = matrix(...)
M[[3]] = matrix(...)
And I want to do an operation on my list of matrices. I want to do:
outer(M, M, "*")
In this case, I want to take the dot product of each combination of matrices I have.
Actually, I am trying to generate a kernel matrix (and I have written a kernel function), so I want to do:
outer(M, M, kernelFunction)
where kernelFunction calculates a distance between my two matrices.
The problem is that outer() only takes "vector" arguments, rather than "list"s etc. Is there a function that does the equivalent of outer() for non-vector entities?
Alternately, I could use a for-loop to do this:
M = list() # Each element in M is a matrix
for (i in 1:numElements)
{
for (j in 1:numElements)
{
k = kernelFunction(M[[i]], M[[j]])
kernelMatrix[i,j] = k;
}
}
but I am trying to avoid this in favor of an R construct (which might be more efficient). (Yes, I know I can modify the for-loop to compute only one triangle of the matrix and save 50% of the computations. But that's not the code that I'm trying to optimize!)
Is this possible? Any thoughts/suggestions?
The outer function actually DOES work on lists, but the function that you provide is called only once, with the two inputs repeated so that together they contain all possible combinations. It therefore has to be vectorized over those expanded inputs.
As for which is faster, combining outer with vapply is 3x faster than the double for-loop on my machine. If the actual kernel function does "real work", the difference in looping speed is probably not so important.
f1 <- function(a,b, fun) {
outer(a, b, function(x,y) vapply(seq_along(x), function(i) fun(x[[i]], y[[i]]), numeric(1)))
}
f2 <- function(a,b, fun) {
kernelMatrix <- matrix(0L, length(a), length(b))
for (i in seq_along(a))
{
for (j in seq_along(b))
{
kernelMatrix[i,j] = fun(a[[i]], b[[j]])
}
}
kernelMatrix
}
n <- 300
m <- 2
a <- lapply(1:n, function(x) matrix(runif(m*m),m))
b <- lapply(1:n, function(x) matrix(runif(m*m),m))
kernelFunction <- function(x,y) 0 # dummy, so we only measure the loop overhead
> system.time( r1 <- f1(a,b, kernelFunction) )
user system elapsed
0.08 0.00 0.07
> system.time( r2 <- f2(a,b, kernelFunction) )
user system elapsed
0.23 0.00 0.23
> identical(r1, r2)
[1] TRUE
Just use the for loop. Any built-in functions will degenerate to that anyway, and you'll lose clarity of expression, unless you carefully build a function that generalises outer to work with lists.
The biggest improvement you could make would be to preallocate the matrix:
M <- list()
length(M) <- numElements ^ 2
dim(M) <- c(numElements, numElements)
PS. A list is a vector.
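For a numeric kernel, preallocating the result as a numeric matrix might look like the sketch below. It assumes the kernel is symmetric, so only one triangle is computed and mirrored. kernel_matrix and the example kernel frob are hypothetical names used for illustration.

```r
# Hypothetical helper: fill a preallocated matrix, assuming fun(a, b) == fun(b, a)
kernel_matrix <- function(M, fun) {
  n <- length(M)
  K <- matrix(NA_real_, n, n)        # preallocate once
  for (i in seq_len(n)) {
    for (j in i:n) {                 # upper triangle including the diagonal
      K[i, j] <- K[j, i] <- fun(M[[i]], M[[j]])
    }
  }
  K
}

# Example kernel (assumption): Frobenius inner product of two matrices
frob <- function(x, y) sum(x * y)

M <- list(diag(2), matrix(1:4, 2), matrix(2, 2, 2))
K <- kernel_matrix(M, frob)
K
```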
Although this is an old question, here is another solution that is more in the spirit of the outer function. The idea is to apply outer along the indices of list1 and list2:
cor2 <- Vectorize(function(x,y) {
vec1 <- list1[[x]]
vec2 <- list2[[y]]
cor(vec1,vec2,method="spearman")
})
outer(1:length(list1), 1:length(list2), cor2)
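The same index trick can be wrapped into a small helper that takes the two lists and the pairwise function as arguments instead of reading list1 and list2 from the enclosing scope. This is a sketch; outer_list and the example kernel frob are hypothetical names.

```r
# Hypothetical helper: an outer()-like function for lists of arbitrary objects
outer_list <- function(a, b, fun) {
  # Vectorize turns the scalar (i, j) function into one that accepts
  # the expanded index vectors that outer() passes in
  pair_fun <- Vectorize(function(i, j) fun(a[[i]], b[[j]]))
  outer(seq_along(a), seq_along(b), pair_fun)
}

# Example kernel (assumption): Frobenius inner product of two matrices
frob <- function(x, y) sum(x * y)

M <- list(diag(2), matrix(1:4, 2), matrix(2, 2, 2))
K <- outer_list(M, M, frob)
K
```

As with the other approaches, fun is still called once per pair, so any speed difference comes from loop overhead only; the function must return a single value for each pair.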