Vectorize a two argument function - r

I have a covariance-type function of two lags, h1 and h2. I am trying to avoid for loops when building the matrix of its values.
When I call the vectorized function, for example covmatrix(h1=1:5,h2=1:5), it does not give me a matrix, just a vector. How can I obtain the whole 5 by 5 matrix?
I tried all the apply functions, and the new vectorize function (with lower case v).
R code:
n <- 100  # series length; the function below relies on it
x <- arima.sim(n = n, list(ar = .5))
cov <- function(h1, h2) {
  (1/n) * sum((x[1:(n-h1-h2)] - mean(x)) * (x[(1+h1):(n-h2)] - mean(x)) * (x[(1+h1+h2):n] - mean(x)))
}
covmatrix <- Vectorize(cov)

A simple double-apply should get you what you are looking for. Note that the vectorized function returns only the diagonal of this matrix, because Vectorize pairs its arguments element by element.
test <- sapply(1:5, function(x) sapply(1:5, function(y) cov(x, y)))
all.equal(diag(test), covmatrix(1:5, 1:5))
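As an alternative to the nested sapply, outer() can expand the grid of lag pairs and hand it to the vectorized function. The following is a minimal sketch under assumptions: the name cov3 and the centred series xc are introduced here for illustration (cov3 avoids masking stats::cov), and the seed is arbitrary.

```r
set.seed(1)
n <- 100
x <- as.numeric(arima.sim(n = n, list(ar = 0.5)))
xc <- x - mean(x)   # centred series, computed once

# Third-order sample moment at lags h1 and h2 (scalar lags only).
# cov3 is a hypothetical name standing in for the question's cov.
cov3 <- function(h1, h2) {
  sum(xc[1:(n - h1 - h2)] * xc[(1 + h1):(n - h2)] * xc[(1 + h1 + h2):n]) / n
}

# Vectorize makes cov3 accept the expanded lag vectors that outer() passes in.
cov3_vec <- Vectorize(cov3)
m <- outer(1:5, 1:5, cov3_vec)   # m[i, j] is cov3(i, j)
```

Here rows index h1 and columns index h2, which is the transpose of the nested sapply result above.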

Related

How to calculate covariance using apply function in R?

I'm trying to simulate two distributions and compare their covariance using two functions. But when defining my covariance functions, I run into an error:
Error in apply(c(X, Y), 1, cov) : dim(X) must have a positive length
This is my code so far:
X <- matrix(rnorm(30,0,2))
Y <- matrix(rnorm(30,2,3))
S2 <- apply(c(X,Y), 1, cov)
Thank you in advance.
I am not really sure what your final goal is, given X and Y.
But maybe cov(cbind(X,Y)) is doing what you expect?
We need mapply() here.
mapply(cov, list(X, Y))
Try sapply(list(X, Y), cov). sapply() takes a list or vector as input and returns a vector.
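The suggestions above do not all compute the same quantity, so a small side-by-side sketch may help. The error in the question arises because c(X, Y) flattens both matrices into a plain vector with no dim attribute, which apply() cannot use. The variable names cov_xy, cov_mat and variances are introduced here for illustration only.

```r
set.seed(1)
X <- matrix(rnorm(30, 0, 2))
Y <- matrix(rnorm(30, 2, 3))

cov_xy    <- cov(X, Y)                  # 1 x 1 matrix: covariance between X and Y
cov_mat   <- cov(cbind(X, Y))           # 2 x 2 covariance matrix of both columns
variances <- sapply(list(X, Y), cov)    # cov of each matrix alone: var(X), var(Y)
```

The off-diagonal entry of cov_mat matches cov_xy, while sapply(list(X, Y), cov) only reproduces its diagonal.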

Bootstrap function for dataframe - passing a function as an argument in R

I am trying to create a bootstrap function for my assignment. The requirement is as follows:
Compute the bootstrap standard error for:
- mean()
- median()
- the top quartile
- max()
- the standard deviation of the price.
One way to approach this is to define a new function for each. Another is to write a bootstrap_func function that takes an additional argument called fun, and then you call it bootstrap_func(B, v, median) to have the same effect as bootstrap_median. Implement this function bootstrap_func. Example call to this function: bootstrap_func(1000, vienna_data$price, mean).
Generalize the function further so that the second argument (v) can be a vector or a dataframe. Therefore, the third argument can be a function that takes a vector, such as mean, or a function that takes a dataframe and returns some number, such as a function that computes a linear model and returns the estimate of the linear model. Use this new function to compute bootstrap estimators for the standard errors of some linear model coefficients on the vienna dataset, e.g. the effect of stars on prices. You have to define and name a function that returns the coefficient of the right linear model (say estimate_of_stars_on_prices <- ...), and pass this function as one of the arguments to bootstrap_func.
I created the bootstrap function for the vector like this
sim <- function(v) {
sample(v, replace = TRUE)
}
bootstrap_func <- function(B, v, fun) {
sd(replicate(B, fun(sim(v))))
}
quartile <- function(x) {quantile(x, 0.75)}
So I can call an example like this
bootstrap_func(100, hotels_vienna$price, mean)
bootstrap_func(100, hotels_vienna$price, quartile)
And I think it works fine enough. But I have trouble generalizing it to take also the dataframe and the function that gets the coefficient. My function to get the coefficient is
coef <- function(v, y, x) {
Y <- v[,y]
X <- v[,x]
lmm <- lm(Y ~ X, v)
lmm$coefficients[[2]]
}
coef(hotels_vienna, 2, 12) # this works, col2 = price, col12= distance, result = -22.78177
This is my attempt at the generalized code
df_bootstrap_func <- function(B, v, fun, ...) {
new_v <- function(v) {sample(v, replace = TRUE)}
sd(replicate(B, fun(new_v)))
}
df_bootstrap_func(100, hotels_vienna, coef)
# does not work, throws: Error in v[, y] : object of type 'closure' is not subsettable
I have tried multiple versions of the df_bootstrap_func but no success, so I think I need a new approach to the coefficient function. I appreciate any input. TIA.
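Two things appear to go wrong in the attempt above: fun(new_v) passes the resampling function itself rather than a resampled dataset (hence the 'closure' error), and sample() applied to a data frame draws columns, not rows. One possible way around both is sketched below. It is not the assignment's reference solution: the helper resample, the statistic slope_of_wt_on_mpg, and the use of the built-in mtcars data in place of hotels_vienna are all assumptions made for illustration.

```r
# Resample a vector elementwise, or a data frame row by row.
# (resample is a hypothetical helper name.)
resample <- function(v) {
  if (is.data.frame(v)) {
    v[sample(nrow(v), replace = TRUE), , drop = FALSE]
  } else {
    sample(v, replace = TRUE)
  }
}

# Bootstrap standard error of fun; extra arguments in ... are forwarded to fun.
bootstrap_func <- function(B, v, fun, ...) {
  sd(replicate(B, fun(resample(v), ...)))
}

# Stand-in for estimate_of_stars_on_prices, using mtcars columns instead.
slope_of_wt_on_mpg <- function(df) {
  coef(lm(mpg ~ wt, data = df))[["wt"]]
}

set.seed(1)
se_mean  <- bootstrap_func(200, mtcars$mpg, mean)            # vector input
se_slope <- bootstrap_func(200, mtcars, slope_of_wt_on_mpg)  # data frame input
```

With this shape, a statistic that needs column indices (like the three-argument function in the question) could receive them through the ... arguments.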

Problems with Gaussian Quadrature in R

I'm using the gaussquad package to evaluate some integrals numerically.
I thought the ghermite.h.quadrature command worked by evaluating a function f(x) at points x1, ..., xn and then constructing the sum w1*f(x1) + ... + wn*f(xn), where x1, ..., xn and w1, ..., wn are nodes and weights supplied by the user.
Thus I thought the commands
ghermite.h.quadrature(f,rule)
sum(sapply(rule$x,f)*rule$w)
would yield the same output for any function f, where "rule" is a dataframe which stores the nodes in a column labelled "x" and the weights in a column labelled "w". For many functions the output is indeed the same, but for some functions I get very different results. Can someone please help me understand this discrepancy?
Thanks!
Code:
n.quad = 50
rule = hermite.h.quadrature.rules(n.quad)[[n.quad]]
f <- function(z){
f1 <- function(x,y) pnorm(x+y)
f2 <- function(y) ghermite.h.quadrature(f1,rule,y = y)
g <- function(x,y) x/(1+y) / f2(y)*pnorm(x+y)
h <- function(y) ghermite.h.quadrature(g,rule,y=y)
h(z)
}
ghermite.h.quadrature(f,rule)
sum(sapply(rule$x,f)*rule$w)
OK, that problem got me interested.
I've looked into the gaussquad sources, and the author is clearly not running sapply internally: every integrand is expected to return a vector when given a vector argument.
It is clearly stated in documentation:
functn an R function which should take a numeric argument x and possibly some parameters.
The function returns a numerical vector value for the given argument x
If you only use built-in functions, they are already written that way, so everything works.
You have to rewrite your function so that it accepts a vector argument and returns a vector.
UPDATE
Vectorize() rectifies the problem for me, as does a simple wrapper with sapply:
vf <- function(z) {
sapply(z, f)
}
After either of those changes, results are identical: 0.2029512
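The mechanism can be reproduced without gaussquad. The sketch below uses a made-up three-point rule (not a real Hermite rule) and a toy integrand whose inner sum collapses to a single number, which is the same pattern as the nested quadrature in the question. All names and values here are assumptions for illustration.

```r
# Toy three-point rule: made-up nodes and weights, not a real Hermite rule.
rule <- data.frame(x = c(-1, 0, 1), w = c(0.25, 0.5, 0.25))

# The inner sum returns one number, so f is only correct for a scalar z.
f <- function(z) sum(rule$w * exp(z * rule$x))

wrong <- sum(rule$w * f(rule$x))           # f receives all nodes at once
right <- sum(rule$w * sapply(rule$x, f))   # f evaluated node by node

vf <- Vectorize(f)                         # same effect as the sapply wrapper
also_right <- sum(rule$w * vf(rule$x))
```

When f is handed the whole node vector, z and rule$x are multiplied elementwise and summed, so a single value comes back instead of one value per node; that is why the two computations disagree until f is vectorized.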

Linear regression using a list of function

I have a dataset with X and Y values obtained from a calibration, and I have to fit them with a predefined list of polynomial functions and choose the one with the best R2.
The most naive version of the function would be
try<-function(X,Y){
f1<- x + I(x^2.0) - I(x^3.0)
f2<- x + I(x^1.5) - I(x^3.0)
...
f20<- I(x^2.0) - I(x^2.5) + I(x^0.5)
r1<- lm(y~f1)
r2<- lm(y~f2)
...
r20<-lm(y~f20)
v1<-summary(r1)$r.squared
v2<-summary(r2)$r.squared
...
v20<-summary(r20)$r.squared
v<-c(v1,v2,...,v20)
return(v)
}
I'd like then to make this function shorter and smarter (especially from the definition of r1 to the end). I'd also like to give the user the possibility to choose a function among f1 to f20 (typing the desired row number of v) and see the output of the function print and plot on it.
Please, could you help me?
Thank you.
@mso: the idea of using sapply is nice, but unfortunately this way I don't use a polynomial for the regression: my x vector is transformed into the f1 vector according to the formula and then used for the regression. I obtain just one parameter instead of 3 (in this case).
Create F as a list and proceed:
F <- list(f1, f2, ...., f20)
r <- lapply(F, function(x) lm(y ~ x))
v <- sapply(r, function(m) summary(m)$r.squared)
return(v)
lapply will take each element of F, perform lm with y, and collect the fitted models in the list r. (sapply would try to simplify the models into a matrix, which breaks the next step.) In the next line, sapply will take every element of r, get its summary, and put the R-squared values in the vector v.
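To address the comment about getting one coefficient per term, one option is to store formulas rather than precomputed vectors. The sketch below uses made-up calibration data and only three candidate formulas; the names forms, fits and r2 are hypothetical. Note that inside a formula a minus sign removes a term instead of subtracting it, so every term is joined with + and the sign ends up in the fitted coefficient.

```r
set.seed(42)
x <- seq(1, 10, length.out = 40)
y <- 2 + 0.5 * x - 0.1 * x^2 + rnorm(40, sd = 0.2)   # made-up calibration data
dat <- data.frame(x = x, y = y)

# Candidate models as formulas; each I(...) term gets its own coefficient.
forms <- list(
  f1 = y ~ x + I(x^2) + I(x^3),
  f2 = y ~ x + I(x^1.5) + I(x^3),
  f3 = y ~ I(x^2) + I(x^2.5) + I(x^0.5)
)

fits <- lapply(forms, function(f) lm(f, data = dat))   # list of fitted models
r2   <- sapply(fits, function(m) summary(m)$r.squared) # named vector of R-squared
best <- fits[[which.max(r2)]]                          # model with the highest R-squared
```

A user-chosen model can then be inspected with, for example, summary(fits[[2]]) or plot(fits[[2]]).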

Integration of a vector return one value

I am using R to do some multivariate analysis. For this work I need to integrate a trivariate PDF. Since I want to use this in an MLE, I want a vector of integrals. Is there a way to make integrate return a vector instead of one value?
Here is simple example:
f1=function(x, y, z) {dmvnorm(x=as.matrix(cbind(x,y,z)), mean=c(0,0,0), sigma=sigma)}
f1(x=c(1,1,1), y=c(1,1,1), z=c(1,1,1))
integrate(Vectorize(function(x) {f1(x=c(1,1,1), y=c(1,1,1), z=c(1,1,1))}), lower = - Inf, upper = -1)$value
Error in integrate(Vectorize(function(x) { : evaluation of function gave a result of wrong length
To integrate a function of one variable, with vector values,
you can transform the function into n functions with real values,
and integrate each of them.
This is very inefficient (when integrating the i-th function,
I evaluate all the functions, and discard all but one value).
# Function to integrate
d <- rnorm(10)
f <- function(x) dnorm(d, mean=x)
# Integrate those n functions separately.
n <- length(f(1))
r <- sapply( 1:n,
function(i) integrate(
Vectorize(function(x) f(x)[i]),
lower=-Inf, upper=0
)$value
)
r
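When each component of the vector-valued function can be written on its own, the repeated evaluation can be avoided by building one scalar integrand per component. The sketch below does this for the same integrand as above; the closed form pnorm(-d) holds only for this particular normal-density example and is included purely as a sanity check. The names r_direct and r_exact are introduced here for illustration.

```r
set.seed(1)
d <- rnorm(10)

# One scalar integrand per component: x is the integration variable, di is fixed.
r_direct <- sapply(d, function(di) {
  integrate(function(x) dnorm(di, mean = x), lower = -Inf, upper = 0)$value
})

# Closed form for this specific integrand, used only to check the numbers.
r_exact <- pnorm(-d)
```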
For 2-dimensional integrals, you can check pracma::integral2,
but the same manipulation (transforming a bivariate function with vector values
into n bivariate functions with real values) will probably be needed.
