I've been searching for an answer but couldn't find any information about this function except the official R docs.
If I want to calculate the values of a one-dimensional normal distribution at the same x with different means or standard deviations, I just call
dnorm(x, mu, sigma)
where mu and sigma are arrays with the desired means and sigmas.
Is there any way to perform the same trick with the dmnorm function from the mnormt module, when x and mu are vectors and sigma is a covariance matrix?
P.S.: Sorry for my English, thanks for answers.
In R, collections of functions are called "packages". If a function is not vectorized in its parameters, you can iterate over one parameter with sapply, or over several parameters in parallel (supplied as a set of lists) with mapply. You should also consider the mathematical issue: the 'mean' is no longer a single number but a vector, and sigma (which dmnorm calls 'varcov') is no longer a single number but a matrix. The first example in the help page gives you the densities of 21 different x,y,z's with a single mean vector and sigma matrix.
Using that example as a starting point, make a list of 7 x,y,z's and 7 varying means and sigmas, and then mapply dmnorm over the first 7 items in the xyz's:
library(mnormt)  # provides dmnorm
x <- seq(-2,4)
y <- 2*x+10
z <- x+cos(y)
mu <- c(1,12,2)
Sigma <- matrix(c(1,2,0,2,5,0.5,0,0.5,3), 3, 3)
lsig <- lapply(seq(-2,4)/10, "+", Sigma); lmean<-lapply(seq(-2,4)/10, "+",mu)
mapply(dmnorm, x=as.data.frame(t(cbind(x,y,z)[1:7,])), mean=lmean, varcov=lsig)
# V1 V2 V3 V4 V5 V6 V7
# 6.177e-06 6.365e-04 5.364e-03 3.309e-02 2.205e-02 6.898e-03 1.077e-03
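To see what mapply is doing here without depending on mnormt, the following is a minimal base-R sketch. The helper mvn_density is hypothetical (it is not part of any package) and simply writes out the multivariate normal density; the point is that each call receives one x vector, one mean vector, and one covariance matrix from the parallel lists. The evaluation points, means, and covariance matrices are made up for illustration.

```r
# Hypothetical helper: multivariate normal density in base R
# (stands in for mnormt::dmnorm so the sketch has no package dependency).
mvn_density <- function(x, mean, varcov) {
  dev <- x - mean
  q <- sum(dev * solve(varcov, dev))   # squared Mahalanobis distance
  exp(-0.5 * q) / sqrt((2 * pi)^length(x) * det(varcov))
}

# Three evaluation points, each with its own mean vector and covariance matrix.
xs     <- list(c(0, 0), c(1, 2), c(-1, 0.5))
means  <- list(c(0, 0), c(1, 1), c(0, 0))
sigmas <- list(diag(2), diag(c(1, 4)), diag(c(2, 0.5)))

# mapply walks the three lists in parallel: one density value per triple.
dens <- mapply(mvn_density, x = xs, mean = means, varcov = sigmas)

# With diagonal covariance matrices the density factorizes into univariate
# normals, which gives an independent check via dnorm.
check <- mapply(function(x, m, s) prod(dnorm(x, m, sqrt(diag(s)))),
                xs, means, sigmas)
all.equal(dens, check)
```

The diagonal covariance matrices are chosen only so that the result can be checked against dnorm; the helper itself accepts any positive definite matrix.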
I need to solve this optimization problem in order to estimate lambda:
Basically, I need to find the correlation between these two functions:
f1 <- function(lambda, tau){slope = (1-exp(-lambda*tau))/(lambda*tau)
return(slope)}
f2 <- function(lambda, tau){curve = ((1-exp(-lambda*tau))/(lambda*tau))-exp(-lambda*tau)
return(curve)}
I know the different values of tau. Suppose for example tau = 0.25: now f1 and f2 have only one missing parameter, lambda, which should be estimated. However, when I try to pass them to optim() to be minimized, it does not work since f1 and f2 are not numeric. How can I build this kind of optimization problem while maintaining f1 and f2 as functions?
Many thanks
If I am understanding correctly, you are trying to minimise the squared correlation between the output of f1 and f2 at different values of lambda. This means that for each value of lambda you are assessing, you need to feed in the complete vector of tau values. This will give a vector output for each value of lambda so that a correlation between the output from the two functions can be calculated at any single value of lambda.
To do this, we create a vectorized function that takes lambda values and calculates the squared correlation between f1 and f2 at those values of lambda across all values of tau:
f3 <- function(lambda) {
sapply(lambda, function(l) {
cor(f1(l, seq(1/12, 10, 1/12)), f2(l, seq(1/12, 10, 1/12)))^2
})
}
To get the optimal value of lambda that minimizes the squared correlation, we just use optimize:
optimize(f3, c(0, 100))$minimum
#> [1] 0.6678021
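A variant of the same idea, as a sketch: the maturities can be passed in as an argument instead of being hard-coded, since optimize forwards extra arguments to the objective function. The name sq_cor and the tau grid below are illustrative assumptions; f1 and f2 are the functions from the question, written compactly. Keep in mind that optimize searches for a local minimum within the given interval, so it is worth plotting the objective if in doubt.

```r
# f1 and f2 as defined in the question
f1 <- function(lambda, tau) (1 - exp(-lambda * tau)) / (lambda * tau)
f2 <- function(lambda, tau) (1 - exp(-lambda * tau)) / (lambda * tau) - exp(-lambda * tau)

# Squared correlation between the two outputs for a single lambda
sq_cor <- function(lambda, tau) cor(f1(lambda, tau), f2(lambda, tau))^2

tau <- seq(1/12, 10, by = 1/12)   # assumed maturities, in years

# Extra named arguments (here tau) are passed on to sq_cor
opt <- optimize(sq_cor, interval = c(0, 100), tau = tau)
opt$minimum     # lambda at the minimum found
opt$objective   # squared correlation at that lambda
```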
Perhaps the examples at the bottom of the page help: https://search.r-project.org/CRAN/refmans/NMOF/html/NSf.html
They input a vector of times into the functions (which is fixed for a given yield-curve), so you can compute the correlation for a given lambda. To minimize the correlation, do a grid search over the lambdas.
In your case, for instance,
lambda <- 2
cor(f1(lambda, 1:10), f2(lambda, 1:10))
Note that I have assumed maturity measured in years, 1 to 10. You will need to fill in appropriate values.
To find a lambda that leads to a low correlation, you could run a grid search.
lambdas <- seq(0.00001, 25, length.out = 1000)
squared.corr <- rep(NA_real_, length(lambdas))
for (i in seq_along(lambdas)) {
rho <- cor(f1(lambdas[i], 1:10),
f2(lambdas[i], 1:10))
squared.corr[i] <- rho*rho
}
lambdas[which.min(squared.corr)]
## [1] 0.490
(I am one of the authors of Gilli, Grosse and Schumann (2010), on which the suggestion to minimize the correlation is based.)
I'm trying to calculate the covariance of a matrix which has two collinear vectors. I have read that this is impossible with the "cov" function in R.
Does a different function exist in R to calculate the covariance of a matrix which has two collinear vectors (since it works in Matlab and Excel)?
Thank you in advance for your answers.
Please consider providing a reproducible example with sample of your data and the corresponding code. Broadly speaking, a covariance matrix can be created with use of the code below:
# Vectors
V1 <- c(1:4)
V2 <- c(5:8)
V3 <- runif(n = 4)
V4 <- runif(n = 4)
#create matrix
M <- cbind(V1,V2, V3, V4)
# Covariance
cov(M)
I'm guessing that you may be getting the following error:
number of rows of result is not a multiple of vector length (arg 1)
You could first try to use the cov function as discussed here.
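As a small sketch with made-up data: cov itself accepts collinear columns. The resulting covariance matrix is simply singular (rank-deficient), so it is a later step such as inverting the matrix that fails. The vectors below are illustrative, not the asker's data.

```r
# Made-up data: b is an exact multiple of a, so the two columns are collinear
a  <- c(1, 2, 3, 4, 5)
b  <- 2 * a
c3 <- c(2, 1, 4, 3, 6)
M  <- cbind(a, b, c3)

S <- cov(M)     # computed without error
S

# The collinearity shows up as rank deficiency, not as a failure of cov()
qr(S)$rank      # smaller than ncol(S)

# Inverting the singular matrix is the step that fails
inv <- try(solve(S), silent = TRUE)
inherits(inv, "try-error")
```

If the error appears in a downstream function rather than in cov, dropping one of the collinear columns before that step is the usual remedy.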
I have data which I want to fit to the following equation using R:
Z(u,w)=z0*F(w)*[1-exp((-b*u)/F(w))]
where z0 and b are constants and F(w), w=0,...,9 is a decreasing step function that depends on w with F(0)=1 and u=1,...,50.
Z(u,w) is an observed set of data in the form of a 50x10 matrix (u=50,...,1 down the rows and w=0,...,9 along the columns). For example, since I haven't explained that very well, Z(42,3) is the element in the 9th row down and the 4th column along.
Using F(0)=1 I was able to get estimates of b and z0 using just the first column (ie w=0) with the code:
n0=nls(zuw~z0*(1-exp(-b*u)),start=list(z0=283,b=0.03),options(digits=10))
I then found F(w) for w=1,...,9 by going through each column and using the values of b and z0 I found.
However, I was wanting to find a way to estimate all the 12 parameters at once (b, z0 and the 10 values of F(w)) as b and z0 should be fitted to all the data, not just the first column.
Does anyone know of any way of doing this? All help would be greatly appreciated!
Thanks
James
This may be a case where the formula interface of the nls(...) function works against you. As an alternative, you can use nls.lm(...) in the minpack.lm package to perform non-linear regression with a programmatically defined function. To demonstrate this, first we create an artificial dataset which follows your functional form by design, with random error added (error ~ N[0,1]).
u <- 1:50
w <- 0:9
z0 <- 100
b <- 0.02
F <- 10/(10+w^2)
# matrix containing data, in OP's format: rows are u, cols are w
m <- do.call(cbind,lapply(w,function(w)
z0*F[w+1]*(1-exp(-b*u/F[w+1]))+rnorm(length(u),0,1)))
So now we have a matrix m, which is equivalent to your dataset. This matrix is in the so-called "wide" format: the response for different values of w is in different columns. We need it in "long" format: all responses in a single column, with separate columns identifying u and w. We do this using melt(...) in the reshape2 package.
# prepend values of u
df.wide <- data.frame(u=u, m)
library(reshape2)
# reshape to long format: col1 = u, col2=w, col3=z
df <- melt(df.wide,id="u",variable.name="w", value.name="z")
df$w <- as.numeric(substr(df$w,2,4))-1
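If you prefer not to depend on reshape2, the same long-format layout can be sketched in base R. This relies on as.vector() unrolling a matrix column by column. The small matrix below is a noise-free stand-in for m, built only so that the result is easy to check by eye.

```r
u <- 1:50
w <- 0:9
# Stand-in for the data matrix m: rows are u, columns are w (illustrative values)
m <- outer(u, w, function(u, w) u + 100 * w)

# as.vector(m) stacks the columns, so u cycles fastest and w changes
# once per block of length(u) rows.
df <- data.frame(
  u = rep(u, times = length(w)),
  w = rep(w, each  = length(u)),
  z = as.vector(m)
)

head(df, 3)
# Spot check: the row for u = 42, w = 3 holds m[42, 4]
df$z[df$u == 42 & df$w == 3] == m[42, 4]
```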
Now we have a data frame df with columns u, w, and z. We call nls.lm(...) with (at least) 4 arguments: par is a vector of initial estimates of the parameters of the fit, fn is a function that calculates the residuals at each step, observed is the dependent variable (z), and xx is a vector or matrix containing the independent variables (u, w). The last two are not arguments of nls.lm itself; they are passed through to fn.
Next we define a function, f(par, xx), where par is an 11 element vector. The first two elements contain estimates of z0 and b. The next 9 contain estimates of F(w), w=1:9. This is because you state that F(0) is known to be 1. xx is a matrix with two columns: the values for u and w respectively. f(par,xx) then calculates estimate of the response z for all values of u and w, for the given parameter estimates.
library(minpack.lm)
# model function
f <- function(pars, xx) {
z0 <- pars[1]
b <- pars[2]
F <- c(1,pars[3:11])
u <- xx[,1]
w <- xx[,2]
z <- z0*F[w+1]*(1-exp(-b*u/F[w+1]))
return(z)
}
# residual function
resids <- function(p, observed, xx) {observed - f(p,xx)}
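Before fitting, a quick sanity check can catch indexing mistakes in the model function. The sketch below repeats f and resids from above so that it runs on its own, builds noise-free responses from known parameters (the same illustrative values used to simulate the data), and confirms that the residuals vanish at those parameters.

```r
# Model and residual functions, repeated from above
f <- function(pars, xx) {
  z0 <- pars[1]
  b  <- pars[2]
  F  <- c(1, pars[3:11])          # F(0) fixed at 1, F(1..9) estimated
  u  <- xx[, 1]
  w  <- xx[, 2]
  z0 * F[w + 1] * (1 - exp(-b * u / F[w + 1]))
}
resids <- function(p, observed, xx) observed - f(p, xx)

# Every (u, w) combination, in long format
xx <- expand.grid(u = 1:50, w = 0:9)

# Known parameters: z0, b, then F(1), ..., F(9)
F.true <- 10 / (10 + (1:9)^2)
p.true <- c(100, 0.02, F.true)

z.exact <- f(p.true, xx)                   # noise-free responses
max(abs(resids(p.true, z.exact, xx)))      # 0 by construction
```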
Next we perform the regression using nls.lm(...), which uses a highly robust fitting algorithm (Levenberg-Marquardt). Consequently, we can set the par argument (containing the initial estimates of z0, b, and F) to all 1's, which is fairly distant from the values used in creating the dataset (the "actual" values). nls.lm(...) returns a list with several components (see the documentation). The par component contains the final estimates of the fit parameters.
# initial parameter estimates; all 1's
par.start <- c(z0=1, b=1, rep(1,9))
# fit using Levenberg-Marquardt algorithm
nls.out <- nls.lm(par=par.start,
fn = resids, observed = df$z, xx = df[,c("u","w")],
control=nls.lm.control(maxiter=10000, ftol=1e-6, maxfev=1e6))
par.final <- nls.out$par
results <- rbind(predicted=c(par.final[1:2],1,par.final[3:11]),actual=c(z0,b,F))
print(results,digits=5)
#               z0        b
# predicted 102.71 0.019337 1 0.90456 0.70788 0.51893 0.37804 0.27789 0.21204 0.16199 0.13131 0.10657
# actual    100.00 0.020000 1 0.90909 0.71429 0.52632 0.38462 0.28571 0.21739 0.16949 0.13514 0.10989
So the regression has done an excellent job of recovering the "actual" parameter values. Finally, we plot the results using ggplot just to make sure this is all correct. I can't overemphasize how important it is to plot the final results.
df$pred <- f(par.final,df[,c("u","w")])
library(ggplot2)
ggplot(df,aes(x=u, color=factor(w)))+
geom_point(aes(y=z))+ geom_line(aes(y=pred))
I have been trying to apply these functions but I am having some problems.
For one variable (x) I have
mean <- rnorm(K,mean=mean(x),sd=sd(x))
sigma2 <- rep(sd(x),K)
for (k in 1:K)
{
f[,k] <- dnorm(x,mu[k],sigma2[k]) ##pdf ##
}
I want to do the same, but now I have a matrix (T) with two variables, x and y.
Could somebody help me, please? I am new to R. Thanks
The mu object is not defined (and neither are K or x), so I'm going to assume your brain skipped a beat and that you really wanted mu to be what you called mean, which you defined one line earlier. I'm further going to name them both mu, since naming objects after the function that creates them is a bad idea. Your for-loop is entirely unnecessary since dnorm is vectorized:
K= 100; x <- rnorm(10)
mu <- rnorm(K,mean=mean(x),sd=sd(x))
sigma2 <-sd(x)
f <- dnorm(x,mean=mu, sd=sigma2) ##pdf ##
str(f)
# num [1:100] 0.39342 0.42177 0.00906 0.38493 0.29362 ...
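One caveat worth spelling out: with x of length 10 and mu of length K, dnorm recycles the two vectors element by element and returns a vector of length K, not the length(x)-by-K matrix that the loop in the question fills. If that matrix is what is wanted, outer() is one way to get it. A sketch, with illustrative values and a fixed seed:

```r
set.seed(1)                       # only to make the sketch reproducible
K  <- 5
x  <- rnorm(10)
mu <- rnorm(K, mean = mean(x), sd = sd(x))
s  <- sd(x)

# Rows index x, columns index mu: f[i, k] = dnorm(x[i], mu[k], s)
f <- outer(x, mu, function(xi, m) dnorm(xi, mean = m, sd = s))
dim(f)

# Same result as the explicit loop from the question
f.loop <- matrix(NA_real_, nrow = length(x), ncol = K)
for (k in 1:K) f.loop[, k] <- dnorm(x, mu[k], s)
all.equal(f, f.loop)
```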
So now you know how to work with dnorm. To make it work with a matrix by column you can do this:
apply(T, 2, dnorm, mean=mu, sd=sigma2)
Your question title said dmvnorm but your code said dnorm, so if you want to use a multivariate density then you need to specify which package you are using and provide quite a bit more detail about what the goals are.
I am using R to do some multivariate analysis. For this work I need to integrate the trivariate PDF. Since I want to use this in an MLE, I want a vector of integrals. Is there a way to make integrate return a vector instead of one value?
Here is simple example:
f1=function(x, y, z) {dmvnorm(x=as.matrix(cbind(x,y,z)), mean=c(0,0,0), sigma=sigma)}
f1(x=c(1,1,1), y=c(1,1,1), z=c(1,1,1))
integrate(Vectorize(function(x) {f1(x=c(1,1,1), y=c(1,1,1), z=c(1,1,1))}), lower = - Inf, upper = -1)$value
Error in integrate(Vectorize(function(x) { : evaluation of function gave a result of wrong length
To integrate a function of one variable, with vector values,
you can transform the function into n functions with real values,
and integrate each of them.
This is very inefficient (when integrating the i-th function,
I evaluate all the functions, and discard all but one value).
# Function to integrate
d <- rnorm(10)
f <- function(x) dnorm(d, mean=x)
# Integrate those n functions separately.
n <- length(f(1))
r <- sapply( 1:n,
function(i) integrate(
Vectorize(function(x) f(x)[i]),
lower=-Inf, upper=0
)$value
)
r
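For this particular integrand there is a closed form to compare against: integrating dnorm(d, mean = x) over x from -Inf to 0 gives pnorm(-d). The sketch below wraps the approach above in a hypothetical helper, integrate_each (not part of any package), and compares it with that closed form using fixed illustrative values of d.

```r
# Hypothetical helper: integrate each component of a vector-valued function
# of one variable separately.
integrate_each <- function(f, lower, upper) {
  n <- length(f(0))               # assumes f can be evaluated at 0
  sapply(seq_len(n), function(i) {
    integrate(Vectorize(function(x) f(x)[i]),
              lower = lower, upper = upper)$value
  })
}

d <- c(-1.5, -0.3, 0, 0.7, 2)     # fixed illustrative values
f <- function(x) dnorm(d, mean = x)

r <- integrate_each(f, lower = -Inf, upper = 0)

# Numerical result next to the closed form
cbind(numeric = r, exact = pnorm(-d))
```

The helper inherits the inefficiency described above: every component is evaluated on each call and all but one are discarded.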
For 2-dimensional integrals, you can check pracma::integral2,
but the same manipulation (transforming a bivariate function with vector values
into n bivariate functions with real values) will probably be needed.