How to calculate covariance using apply function in R? - r

I'm trying to simulate two distributions and compare their covariance using two functions. But when defining my covariance functions, I run into an error:
Error in apply(c(X, Y), 1, cov) : dim(X) must have a positive length
This is my code so far:
X <- matrix(rnorm(30,0,2))
Y <- matrix(rnorm(30,2,3))
S2 <- apply(c(X,Y), 1, cov)
Thank you in advance.

I am not really sure what your final goal is, given X and Y.
But maybe cov(cbind(X,Y)) does what you expect?

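We can use mapply() here. Note that with a single list argument it calls cov() on X and on Y separately, so it returns the two variances rather than the covariance between X and Y.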
mapply(cov, list(X, Y))

Try sapply(list(X, Y), cov). sapply() takes a list or a vector as input, and here the output is a vector with one value per list element (the variance of X and the variance of Y).
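To make the difference between these suggestions concrete, here is a small sketch using the X and Y from the question. The seed and the names S and v are my own additions for illustration. It shows why the original apply() call fails (c() drops the dimensions) and what each alternative returns.

```r
set.seed(1)                      # seed added only for reproducibility
X <- matrix(rnorm(30, 0, 2))
Y <- matrix(rnorm(30, 2, 3))

# c() flattens both matrices into a plain vector without dim,
# which is why apply(c(X, Y), 1, cov) complains about dim(X).
is.null(dim(c(X, Y)))

# Covariance between X and Y: a 1 x 1 matrix.
cov(X, Y)

# 2 x 2 matrix: variances on the diagonal, covariance off the diagonal.
S <- cov(cbind(X, Y))

# One cov() call per list element: the two variances, i.e. diag(S).
v <- sapply(list(X, Y), cov)
```

If the goal is a single number for the covariance between the two samples, cov(X, Y) or the off-diagonal entry of S is probably what you want.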

Related

Vectorize a two argument function

I have a covariance-type function of two lags, h1 and h2. I am trying to avoid for loops when building the matrix of its values.
When I type cov1 it does not give me a matrix, just a vector, for example when I type covmatrix(h1=1:5,h2=1:5). How can I obtain, for example, the whole 5 by 5 matrix?
I tried all the apply functions, and the new vectorize function (with a lower-case v).
R code:
n=100
x=arima.sim(n = n , list(ar = .5))
cov=function(h1,h2){
(1/n)*sum((x[1:(n-h1-h2)]-mean(x))*(x[(1+h1):(n-h2)]-mean(x))*(x[(1+h1+h2):n]-mean(x)))
}
covmatrix=Vectorize(cov)
A simple double sapply should get you what you are looking for. Note how the return value of the vectorized function is equal to the diagonal of the resulting matrix.
test <- sapply(1:5, function(x) sapply(1:5, function(y) cov(x, y)))
all.equal(diag(test), covmatrix(1:5, 1:5))
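As an alternative sketch, outer() can build the full matrix directly from a vectorized function. The function is renamed cov3 here (my own name) so that it does not mask stats::cov, and the seed is only for reproducibility; the formula itself is the one from the question.

```r
set.seed(1)
n <- 100
x <- as.numeric(arima.sim(n = n, list(ar = 0.5)))

# Same formula as in the question, under a name that does not mask stats::cov.
cov3 <- function(h1, h2) {
  m <- mean(x)
  (1 / n) * sum((x[1:(n - h1 - h2)] - m) *
                (x[(1 + h1):(n - h2)] - m) *
                (x[(1 + h1 + h2):n] - m))
}

# outer() calls the function once with two long vectors holding all (h1, h2)
# pairs, so the function has to be vectorized first.
cov_mat <- outer(1:5, 1:5, Vectorize(cov3))
dim(cov_mat)
```

Here cov_mat[i, j] is cov3(h1 = i, h2 = j). With the nested sapply above, rows and columns are the other way round, so the two matrices are transposes of each other.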

Two different covariance return for same formula in R

I am confused by covariance in R.
When I use E(x*y)-E(x)E(y), I get a different value from cov().
Can you help me understand why?
My code:
spot<- c(0.5,0.61,-0.22,-0.35,0.79,0.04,0.15,0.7,-0.51,-0.41)
future<- c(0.56,0.63,-0.12,-0.44,0.6,-0.06,0.01,0.8,-0.56,-0.46)
ms<-mean(spot)
mf<-mean(future)
msf<-mean(spot*future)
cov<- msf-mf*ms
# the calculation above gives 0.22272, while cov() gives 0.2474667
covr<- cov(spot,future)
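I don't think you used the same formula as cov(). Your version divides by n (the population covariance), while cov() computes the sample covariance, which divides by n-1. The formula for the sample covariance between two vectors X and Y, each of length n, is:
cov(X,Y) = sum((X-mean(X))*(Y-mean(Y)))/(n-1)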
spot<- c(0.5,0.61,-0.22,-0.35,0.79,0.04,0.15,0.7,-0.51,-0.41)
future<- c(0.56,0.63,-0.12,-0.44,0.6,-0.06,0.01,0.8,-0.56,-0.46)
covar = sum((spot-mean(spot))*(future-mean(future)))/(length(spot)-1)
#covar
#0.2474667
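The two numbers differ only by the factor n/(n-1). A short check with the data from the question (pop_cov and sample_cov are names I made up for this sketch):

```r
spot   <- c(0.5, 0.61, -0.22, -0.35, 0.79, 0.04, 0.15, 0.7, -0.51, -0.41)
future <- c(0.56, 0.63, -0.12, -0.44, 0.6, -0.06, 0.01, 0.8, -0.56, -0.46)
n <- length(spot)

# E(x*y) - E(x)E(y): every mean() divides by n, so this is the population covariance.
pop_cov <- mean(spot * future) - mean(spot) * mean(future)

# Rescale by n / (n - 1) to get the sample covariance that cov() reports.
sample_cov <- pop_cov * n / (n - 1)

all.equal(sample_cov, cov(spot, future))
```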

Problems with Gaussian Quadrature in R

I'm using the gaussquad package to evaluate some integrals numerically.
I thought the ghermite.h.quadrature command worked by evaluating a function f(x) at points x1, ..., xn and then constructing the sum w1*f(x1) + ... + wn*f(xn), where x1, ..., xn and w1, ..., wn are nodes and weights supplied by the user.
Thus I thought the commands
ghermite.h.quadrature(f,rule)
sum(sapply(rule$x,f)*rule$w)
would yield the same output for any function f, where "rule" is a data frame which stores the nodes in a column labelled "x" and the weights in a column labelled "w". For many functions the output is indeed the same, but for some functions I get very different results. Can someone please help me understand this discrepancy?
Thanks!
Code:
library(gaussquad)
n.quad = 50
rule = hermite.h.quadrature.rules(n.quad)[[n.quad]]
f <- function(z){
  f1 <- function(x,y) pnorm(x+y)
  f2 <- function(y) ghermite.h.quadrature(f1,rule,y = y)
  g <- function(x,y) x/(1+y) / f2(y)*pnorm(x+y)
  h <- function(y) ghermite.h.quadrature(g,rule,y=y)
  h(z)
}
ghermite.h.quadrature(f,rule)
sum(sapply(rule$x,f)*rule$w)
Ok, that problem got me interested.
I've looked into the gaussquad sources, and the author is clearly not calling sapply internally: every integrand is expected to take a vector argument and return a vector.
This is clearly stated in the documentation:
functn an R function which should take a numeric argument x and possibly some parameters.
The function returns a numerical vector value for the given argument x
The package's own internal functions are written that way, so everything works with them.
You have to rewrite your function so that it accepts a vector argument and returns a vector.
UPDATE
Vectorize() fixes the problem for me, as does a simple wrapper with sapply:
vf <- function(z) {
  sapply(z, f)
}
After either of those changes, results are identical: 0.2029512
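The underlying issue can be reproduced without gaussquad. Below is a minimal sketch in base R: the function f, the nodes and the weights are all made up for illustration and are not a real Hermite rule. A function written for a scalar argument silently collapses a vector argument, and Vectorize() restores one value per node.

```r
# f is written for a scalar z: sum() collapses everything to one number.
f <- function(z) sum(pnorm(z + c(-1, 0, 1)))

nodes   <- c(-1, 0, 1)          # hypothetical quadrature nodes
weights <- c(0.25, 0.5, 0.25)   # hypothetical weights

length(f(nodes))    # 1: the vector argument is collapsed to a single value
vf <- Vectorize(f)
length(vf(nodes))   # 3: one value per node

# With the vectorized version, both forms of the weighted sum agree.
s1 <- sum(vf(nodes) * weights)
s2 <- sum(sapply(nodes, f) * weights)
all.equal(s1, s2)
```

A quadrature routine that passes all nodes to the integrand in one call behaves like the first case, which would explain why it disagrees with the explicit sapply version.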

Linear regression using a list of function

I have a dataset with X and Y values obtained from a calibration, and I have to fit them with a predefined list of polynomial functions and choose the one with the best R2.
The most naive version of the function would be
try<-function(x,y){
f1<- x + I(x^2.0) - I(x^3.0)
f2<- x + I(x^1.5) - I(x^3.0)
...
f20<- I(x^2.0) - I(x^2.5) + I(x^0.5)
r1<- lm(y~f1)
r2<- lm(y~f2)
...
r20<-lm(y~f20)
v1<-summary(r1)$r.squared
v2<-summary(r2)$r.squared
...
v20<-summary(r20)$r.squared
v<-c(v1,v2,...,v20)
return(v)
}
I'd then like to make this function shorter and smarter (especially from the definition of r1 to the end). I'd also like to let the user choose one of the functions f1 to f20 (by typing the desired row number of v) and see the output of print and plot for it.
Please, could you help me?
Thank you.
#mso: the idea of using sapply is nice, but unfortunately this way I don't use a polynomial for the regression: my x vector is transformed into the f1 vector according to the formula and then used as a single regressor. I obtain just one parameter instead of 3 (in this case).
Create F as a list and proceed:
F = list(f1, f2, ...., f20)
r = lapply(F, function(x) lm(y~x))
v = sapply(r, function(x) summary(x)$r.squared)
return(v)
lapply will take each element of F, fit lm against y and put the fitted models in the list r. Use lapply rather than sapply for this step: sapply tries to simplify the lm objects and no longer returns a list of models. In the next line, sapply takes every element of r, gets its summary and puts the R-squared values in the vector v. Hopefully, it should work.
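To address the comment about getting one parameter instead of three, one option is to store the candidates as formulas rather than as precomputed vectors, so that lm estimates one coefficient per term. This is only a sketch: the data are simulated, only the three candidates spelled out in the question are listed, and the names dat, forms, fits and best are my own. Note that inside a formula a minus sign removes a term, so the terms are joined with + and the signs are left to the fitted coefficients.

```r
set.seed(42)
# Hypothetical calibration data (not from the question); x > 0 so x^1.5 is defined.
x <- seq(1, 10, length.out = 30)
y <- 2 + 0.5 * x + 0.1 * x^2 + rnorm(30, sd = 0.2)
dat <- data.frame(x = x, y = y)

# Candidate models as formulas, named after f1, f2 and f20 in the question.
forms <- list(
  f1  = y ~ x + I(x^2) + I(x^3),
  f2  = y ~ x + I(x^1.5) + I(x^3),
  f20 = y ~ I(x^2) + I(x^2.5) + I(x^0.5)
)

fits <- lapply(forms, lm, data = dat)                 # list of fitted models
v <- sapply(fits, function(m) summary(m)$r.squared)   # named vector of R-squared
best <- names(which.max(v))                           # candidate with the highest R-squared
```

The user can then pick any model by name or position, for example print(fits[["f2"]]) or plot(fits[[best]]).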

Integration of a vector return one value

I am using R to do some multivariate analysis. For this work I need to integrate a trivariate PDF. Since I want to use this in an MLE, I want a vector of integrals. Is there a way to make integrate return a vector instead of one value?
Here is a simple example:
f1=function(x, y, z) {dmvnorm(x=as.matrix(cbind(x,y,z)), mean=c(0,0,0), sigma=sigma)}
f1(x=c(1,1,1), y=c(1,1,1), z=c(1,1,1))
integrate(Vectorize(function(x) {f1(x=c(1,1,1), y=c(1,1,1), z=c(1,1,1))}), lower = - Inf, upper = -1)$value
Error in integrate(Vectorize(function(x) { : evaluation of function gave a result of wrong length
To integrate a function of one variable, with vector values,
you can transform the function into n functions with real values,
and integrate each of them.
This is very inefficient (when integrating the i-th function,
I evaluate all the functions, and discard all but one value).
# Function to integrate
d <- rnorm(10)
f <- function(x) dnorm(d, mean=x)
# Integrate those n functions separately.
n <- length(f(1))
r <- sapply( 1:n,
  function(i) integrate(
    Vectorize(function(x) f(x)[i]),
    lower=-Inf, upper=0
  )$value
)
r
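For this particular example the result can be checked against a closed form: integrating dnorm(d[i], mean = x) over x from -Inf to 0 gives pnorm(-d[i]), because the standard normal density depends only on x - d[i]. A quick sketch of that check (the seed and the name expected are my own additions):

```r
set.seed(123)
d <- rnorm(10)
f <- function(x) dnorm(d, mean = x)

# Integrate each component of the vector-valued function separately.
n <- length(f(1))
r <- sapply(seq_len(n), function(i) {
  integrate(Vectorize(function(x) f(x)[i]), lower = -Inf, upper = 0)$value
})

# Closed form for this specific integrand.
expected <- pnorm(-d)
all.equal(r, expected, tolerance = 1e-4)
```

This only validates the component-by-component approach on a case where the answer is known; for a general integrand no such closed form will be available.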
For 2-dimensional integrals, you can check pracma::integral2,
but the same manipulation (transforming a bivariate function with vector values
into n bivariate functions with real values) will probably be needed.
