R: Generate matrix from function

In R, I'm interested in the general case of generating a matrix from a formula such as:
X = some other matrix
Y(i, j) = X(i, j) + Y(i - 1, j - 1)
Unfortunately I can't find how to account for the matrix self-referencing.
Obviously order of execution and bounds checking are factors here, but I imagine these could be accounted for by the matrix orientation and formula respectively.
Thanks.

This solution assumes that you want Y[1,n] == X[1,n] and Y[n,1] == X[n,1]. If not, you can apply the same solution on the sub-matrix X[-1,-1] to fill in the values of Y[-1,-1]. It also assumes that the input matrix is square.
We use the fact that Y[N,N] = X[N,N] + X[N-1, N-1] + ... + X[1,1], plus similar relations for off-diagonal elements. Note that each off-diagonal run of elements is the main diagonal of a particular sub-matrix.
# Example input
X <- matrix(1:16, ncol=4)
Y <- matrix(0, ncol=ncol(X), nrow=nrow(X))
diag(Y) <- cumsum(diag(X))
Y[1,ncol(X)] <- X[1,ncol(X)]
Y[nrow(X),1] <- X[nrow(X),1]
for (i in 1:(nrow(X)-2)) {
  ind <- seq(i)
  diag(Y[-ind,]) <- cumsum(diag(X[-ind,])) # lower triangle
  diag(Y[,-ind]) <- cumsum(diag(X[,-ind])) # upper triangle
}
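The same diagonal running sums can also be sketched without a loop, by grouping cells on row(X) - col(X), which is constant along each diagonal. This is only a minimal sketch and not part of the answer above; it makes the same assumption that the first row and first column of Y equal those of X, and the names diag_id and Y_ave are introduced here for illustration. Unlike the diag() approach, it should not need X to be square.

```r
X <- matrix(1:16, ncol = 4)

# Cells on the same diagonal share the same value of row - col.
diag_id <- as.vector(row(X) - col(X))

# ave() applies cumsum within each diagonal and keeps every value
# in its original position, so Y[i, j] = X[i, j] + Y[i-1, j-1].
Y_ave <- X
Y_ave[] <- ave(as.vector(X), diag_id, FUN = cumsum)
Y_ave
```

Because the matrix is stored column by column, the cells of each diagonal already appear in top-left to bottom-right order, which is the order cumsum needs.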

Well, you can always use a for loop:
Y <- matrix(0, ncol=3, nrow=3)
#boundary values:
Y[1,] <- 1
Y[,1] <- 2
X <- matrix(1:9, ncol=3)
for (i in 2:nrow(Y)) {
  for (j in 2:ncol(Y)) {
    Y[i, j] <- X[i, j] + Y[i-1, j-1]
  }
}
If that is too slow you can translate it to C++ (using Rcpp) easily.

Related

Save output in multiple for loops with if-else

Please see the above equation. As you can see, 0 <= i < j <= n. I wrote the following code in R. When i = 0, we take X_0 = 0. I generated n = 2 observations and calculated the values by hand: 4.540201 and 1.460604. However, my R code overwrites the results, and I get the output 1.460604 1.460604. I can't figure out why. What is the reason for that?
I updated the code below.
n = 2
set.seed(2)
x = rexp(n, 1)
xo = sort(x)
xo
value1 = matrix(NA, nrow = 2,2)
for(j in 1:n){
  for(i in 0:(j-1)){
    value1[i,j] = ifelse(i==0,((n - j + 1)*sum(xo[i+1] - 0)), ((n - j + 1)*sum(xo[i+1] - xo[i])) )
  }
}
value1
You could write that much more simply by using matrix multiplication.
Assuming X_k and X_i are vectors, you could do:
X_k <- as.matrix(X_k)
X_i <- as.matrix(X_i)
difference <- (X_k - X_i)
output <- (n - j + 1) * (t(difference) %*% difference)
Where t() calculates the transpose of a matrix and %*% is matrix multiplication.
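One thing worth checking in the question's loop: R indexes from 1, and assigning to row 0 of a matrix, as value1[0, j] <- ... does when i = 0, silently stores nothing. Below is a minimal sketch of one way around that, assuming the result for (i, j) is stored in row i + 1. The helper vector xo0 (with X_0 = 0 prepended) is introduced here for illustration, and the expression inside the loop is the asker's own, so this only addresses the indexing and does not confirm that it matches the intended equation.

```r
n <- 2
set.seed(2)
xo <- sort(rexp(n, 1))

# xo0[i + 1] holds X_i, so X_0 = 0 sits at position 1.
xo0 <- c(0, xo)

value1 <- matrix(NA_real_, nrow = n, ncol = n)
for (j in 1:n) {
  for (i in 0:(j - 1)) {
    # Row i + 1 stores the result for index i; row 0 does not exist in R.
    value1[i + 1, j] <- (n - j + 1) * (xo0[i + 2] - xo0[i + 1])
  }
}
value1
```

Cells with i >= j are never visited and stay NA.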

Is it possible to use vector math in R for a summation involving intervals?

Title's a little rough, open to suggestions to improve.
I'm trying to calculate time-average covariances for a 500 length vector.
This is the equation we're using
The result I'm hoping for is a vector with an entry for k from 0 to 500 (0 would just be the variance of the whole set).
I've started with something like this, but I know I'll need to reference the gap (i) in the first mean comparison as well:
x <- rnorm(500)
xMean <-mean(x)
i <- seq(1, 500)
dfGam <- data.frame(i)
dfGam$gamma <- (1/(500-dfGam$i))*(sum((x-xMean)*(x[-dfGam$i]-xMean)))
Is it possible to do this using vector math or will I need to use some sort of for loop?
Here's the for loop that I've come up with for the solution:
gamma_func <- function(input_vec) {
  output_vec <- c()
  input_mean <- mean(input_vec)
  iter <- seq(1, length(input_vec)-1)
  for(val in iter){
    iter2 <- seq((val+1), length(input_vec))
    gamma_sum <- 0
    for(val2 in iter2){
      gamma_sum <- gamma_sum + (input_vec[val2]-input_mean)*(input_vec[val2-val]-input_mean)
    }
    output_vec[val] <- (1/length(iter2))*gamma_sum
  }
  return(output_vec)
}
Thanks
Using data.table, mostly for the shift function to make x_{t - k}, you can do this:
library(data.table)
gammabar <- function(k, x){
  xbar <- mean(x)
  n <- length(x)
  df <- data.table(xt = x, xtk = shift(x, k))[!is.na(xtk)]
  df[, sum((xt - xbar)*(xtk - xbar))/n]
}
gammabar(k = 10, x)
# [1] -0.1553118
The filter [!is.na(xtk)] starts the sum at t = k + 1, because xtk will be NA for the first k indices due to being shifted by k.
Reproducible x
x <- c(0.376972124936433, 0.301548373935665, -1.0980231706536, -1.13040590360378,
-2.79653431987176, 0.720573498411587, 0.93912102300901, -0.229377746707471,
1.75913134696347, 0.117366786802848, -0.853122822287008, 0.909259181618213,
1.19637295955276, -0.371583903741348, -0.123260233287436, 1.80004311672545,
1.70399587729432, -3.03876460529759, -2.28897494991878, 0.0583034949929225,
2.17436525195634, 1.09818265352131, 0.318220322390854, -0.0731475581637693,
0.834268741278827, 0.198750636733429, 1.29784138432631, 0.936718306241348,
-0.147433193833294, 0.110431994640128, -0.812504663900505, -0.743702167768748,
1.09534507180741, 2.43537370755095, 0.38811846676708, 0.290627670295127,
-0.285598287083935, 0.0760147178373681, -0.560298603759627, 0.447188372143361,
0.908501134499943, -0.505059597708343, -0.301004012157305, -0.726035976548133,
-1.18007702699501, 0.253074712637114, -0.370711296884049, 0.0221795637601637,
0.660044122429767, 0.48879363533552)
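For comparison, the inner loop of gamma_func can be replaced by a single vectorized product in base R, leaving only one pass over the lags. This is a minimal sketch rather than a definitive implementation; the name gamma_vec is made up here, and it follows the asker's loop in dividing by n - k, whereas gammabar above divides by n.

```r
gamma_vec <- function(x) {
  n <- length(x)
  d <- x - mean(x)  # deviations from the overall mean
  # For each lag k, pair x_t with x_{t-k} for t = k + 1, ..., n.
  vapply(seq_len(n - 1), function(k) {
    sum(d[(k + 1):n] * d[1:(n - k)]) / (n - k)
  }, numeric(1))
}

set.seed(1)
x_demo <- rnorm(500)
gammas <- gamma_vec(x_demo)
head(gammas)
```

Entry k of the result is the lag-k value, so gammas has length(x) - 1 entries; lag 0 is not included, matching gamma_func.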

R: Efficient way to convert factor into binary matrix

I'd like to convert a factor of length n into an n×n binary matrix whose (i, j) element is 1 if the i-th and j-th elements of the factor are the same, and 0 otherwise.
The following is a naive way to implement what I want, but this code is quite slow. Is there a more efficient way to do the same thing?
size <- 100
id <- factor(sample(3, size, replace=TRUE))
mat <- matrix(0, nrow=size, ncol=size)
for(i in 1:size){
  for(j in 1:size){
    if(id[i] == id[j]){
      mat[i, j] <- 1
    }
  }
}
Another alternative, which should be relatively fast:
tcrossprod(model.matrix( ~ id + 0))
Similarly to Hong Ooi's answer, you can also use sparse matrices:
library(Matrix)
tcrossprod(sparse.model.matrix( ~ id + 0))
outer can be used for this.
mat <- outer(id, id, "==")
Since the output is a binary matrix, and O(N^2) objects are kind of large, this is a good use case for sparse matrices:
library(Matrix)
mat <- Matrix(nrow=100, ncol=100)
mat[] <- outer(id, id, "==") # [] means to assign into the existing 'mat' matrix
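As a quick sanity check, the dense approaches above can be compared on the question's example. This is a minimal sketch; the names mat_outer and mat_tcp are made up for illustration. Note that outer(id, id, "==") returns a logical matrix, so 0 is added to get numeric 0/1 values, and that the tcrossprod() result carries dimnames, which are dropped before comparing.

```r
set.seed(42)
size <- 100
id <- factor(sample(3, size, replace = TRUE))

# outer() compares every pair of elements; adding 0 turns TRUE/FALSE into 1/0.
mat_outer <- outer(id, id, "==") + 0

# model.matrix(~ id + 0) builds one indicator column per level;
# its cross-product with itself is 1 exactly where the levels match.
mat_tcp <- unname(tcrossprod(model.matrix(~ id + 0)))

all.equal(mat_outer, mat_tcp)
```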

non-numeric argument to binary operator, AR(1) model

I have an exercise to do where I have to run the following AR(1) model:
x_i = c + φ x_{i-1} + η_i (i = 1, ..., T)
I know that η_i ~ N(0, 1); x_0 ~ N(c/(1-φ), 1/(1-φ^2)); c = 2; φ = 0.6
I am trying to do a for loop. My code is the following:
n <- rnorm(T, 0, 1)
c <- 2
phi <- 0.6
x_0 <- rnorm(1,c/(1-phi), 1/(1-phi**2))
v <- vector("numeric", 0)
for (i in 2:T){
  name <- paste("x", i, sep="_")
  v <- c(v,name)
  v[1] <- c + phi*x_0 + n[1]
  v[i] <- c + phi*v[i-1] + n[i]
}
However, I keep getting this error:
Error in phi * v[i - 1] : non-numeric argument to binary operator
I understand what this error is, but I can't find any solutions to solve it. Could someone please enlighten me? How could I assign numeric values to the name vector?
Thank you!
You're defining v as a numeric vector, but then v <- c(v, name) turns v into a character vector since name is character. That's what's causing the error.
If I'm not mistaken, your intent is to assign names to the values in a numeric vector. That's fine, you just need a different approach.
t <- 10 # example number of time steps (the question does not give a value)
n <- rnorm(t)
c <- 2
phi <- 0.6
x_0 <- rnorm(1, c/(1-phi), 1/(1-phi^2))
v <- c + phi*x_0 + n[1]
for (i in 2:t) {
  v[i] <- c + phi*v[i-1] + n[i]
}
names(v) <- paste("x", 1:t, sep="_")
Vectors in R don't have a static size; they're dynamically resized as needed. So even though we're initializing v with a scalar value, it grows to fit each new value in the loop.
The final step is to name the elements of v, which can be done with names(v) <-. Take a look at v now: it has names!
And as an aside, since T is a synonym for TRUE in R, it's best not to use T as a variable name. Thus I've used t here instead.
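The same recursion can also be sketched without growing v inside the loop. This is a minimal sketch under stated assumptions, not the answer's own method: n_steps, c0 and eta are names introduced here, the series length 10 is arbitrary, and the second parameter of N(., .) in the exercise is assumed to be a variance, which is why rnorm() is given its square root (rnorm's third argument is a standard deviation). The recursive mode of stats::filter() computes y[i] = input[i] + phi * y[i-1], which matches the AR(1) update once c is added to the noise.

```r
set.seed(1)
n_steps <- 10          # hypothetical series length
c0 <- 2                # the constant c, renamed so it does not mask c()
phi <- 0.6
eta <- rnorm(n_steps)  # eta_i ~ N(0, 1)

# Assumption: N(m, s2) is parameterised by variance, so sd = sqrt(s2).
x_0 <- rnorm(1, mean = c0 / (1 - phi), sd = sqrt(1 / (1 - phi^2)))

# Explicit loop with a preallocated result vector.
v_loop <- numeric(n_steps)
prev <- x_0
for (i in seq_len(n_steps)) {
  v_loop[i] <- c0 + phi * prev + eta[i]
  prev <- v_loop[i]
}

# Same recursion via stats::filter(), started from x_0.
v_filt <- as.numeric(stats::filter(c0 + eta, phi, method = "recursive", init = x_0))
names(v_filt) <- paste("x", seq_len(n_steps), sep = "_")
v_filt
```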
I guess you need something like the following. It produces 11 elements, including the initial x value, which you may exclude later.
set.seed(1237)
t <- 10
n <- rnorm(t, 0, 1)
c <- 2
phi <- 0.6
x0 <- rnorm(1, c/(1-phi), 1/(1-phi**2))
v <- c(x0, rep(0, t))
for(i in 2:length(v)) {
  v[i] <- c + phi * v[i-1] + n[i-1]
}
v
[1] 4.967833 4.535847 2.748292 2.792992 5.389548 6.173001 4.526824 3.790483 4.307981 5.442913 4.958193

Speeding up this tricky matrix calculation

As of now I am computing some features from a large matrix and doing it all in a for-loop. As expected it's very slow. I have been able to vectorize part of the code, but I'm stuck on one part.
I would greatly appreciate some advice/help!
s1 <- MyMatrix #dim = c(5167,256)
fr <- MyVector #vector of length 256
tw <- 5
fw <- 6
# For each point S(t,f) we need the sub-matrix of points S_hat(i,j),
# i in [t - tw, t + tw], j in [f - fw, f + fw] for the feature vector.
# To avoid edge effects, I pad the original matrix with zeros,
# resulting in a matrix of size nobs+2*tw x nfreqs+2*fw
nobs <- dim(s1)[1] #note: this is 5167
nf <- dim(s1)[2] #note: this is 256
sp <- matrix(0, nobs+2*tw, nf+2*fw)
t1 <- tw+1; tn <- nobs+tw
f1 <- fw+1; fn <- nf+fw
sp[t1:tn, f1:fn] <- s1 # embed the actual matrix into the padding
nfeatures <- 1 + (2*tw+1)*(2*fw+1) + 1
fsp <- array(NaN, c(dim(sp),nfeatures))
for (t in t1:tn){
  for (f in f1:fn){
    fsp[t,f,1] <- fr[(f - f1 + 1)] #this part I can vectorize
    fsp[t,f,2:(nfeatures-1)] <- as.vector(sp[(t-tw):(t+tw),(f-fw):(f+fw)]) #this line is the problem
    fsp[t,f,nfeatures] <- var(fsp[t,f,2:(nfeatures-1)])
  }
}
fsp[t1:tn, f1:fn, 1] <- t(matrix(rep(fr,(tn-t1+1)),ncol=(tn-t1+1)))
#vectorized version of the first feature ^
return(fsp[t1:tn, f1:fn, ]) #this is the returned matrix
I assume that the var feature will be easy to vectorize after the 2nd feature is vectorized
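One possible way to vectorize the second feature is to loop over the (2*tw+1)*(2*fw+1) window offsets instead of over every point: each offset fills one whole feature slice with a shifted copy of the padded matrix. The following is a minimal sketch under assumptions, not a tested drop-in replacement. A small random matrix stands in for MyMatrix, the names out, d_t, d_f, win and win_mean are introduced here, and the result is stored without the padding rows and columns. The offset order (time offset varying fastest) is chosen to match the column-major order of as.vector() in the original loop.

```r
set.seed(1)
s1 <- matrix(rnorm(20 * 8), 20, 8)  # small stand-in for MyMatrix
fr <- seq_len(ncol(s1))             # stand-in for MyVector
tw <- 2
fw <- 3

nobs <- nrow(s1)
nf <- ncol(s1)
sp <- matrix(0, nobs + 2 * tw, nf + 2 * fw)
t1 <- tw + 1; tn <- nobs + tw
f1 <- fw + 1; fn <- nf + fw
sp[t1:tn, f1:fn] <- s1              # embed the data in the zero padding

nwin <- (2 * tw + 1) * (2 * fw + 1)
nfeatures <- 1 + nwin + 1
out <- array(NaN, c(nobs, nf, nfeatures))

# Feature 1: the frequency value, repeated down every row.
out[, , 1] <- matrix(fr, nobs, nf, byrow = TRUE)

# Features 2..(nfeatures - 1): one shifted copy of sp per window offset.
k <- 2
for (d_f in -fw:fw) {
  for (d_t in -tw:tw) {
    out[, , k] <- sp[(t1 + d_t):(tn + d_t), (f1 + d_f):(fn + d_f)]
    k <- k + 1
  }
}

# Last feature: sample variance over the window, computed slice-wise.
win <- out[, , 2:(nfeatures - 1), drop = FALSE]
win_mean <- rowMeans(win, dims = 2)
out[, , nfeatures] <- rowSums((win - as.vector(win_mean))^2, dims = 2) / (nwin - 1)
```

This replaces nobs * nf small assignments with nwin large ones. At the original size (5167 x 256 with 145 features) the result array is still large, so memory rather than loop overhead may become the limiting factor.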
