Writing a proper normal log-likelihood in R

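I have a problem regarding the following model,
$$x_i \sim N(\mu, \tau^2 + u_i^2), \quad i = 1, \dots, n,$$
where I want to make inference on μ and tau, u is a known vector and x is the data vector. The log-likelihood is
$$\ell(\mu, \tau) = -\frac{n}{2}\log(2\pi) - \frac{1}{2}\sum_{i=1}^{n}\log(\tau^2 + u_i^2) - \frac{1}{2}\sum_{i=1}^{n}\frac{(x_i - \mu)^2}{\tau^2 + u_i^2}$$
I have a problem writing this log-likelihood in R.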
x <- c(3.3569,1.9247,3.6156,1.8446,2.2196,6.8194,2.0820,4.1293,0.3609,2.6197)
mu <- seq(0,10,length=1000)
normal.lik1<-function(theta,x){
u <- c(1,3,0.5,0.2,2,1.7,0.4,1.2,1.1,0.7)
mu<-theta[1]
tau<-theta[2]
n<-length(x)
logl <- sapply(c(mu,tau),function(mu,tau){logl<- -0.5*n*log(2*pi) -0.5*n*log(tau^2+u^2)- (1/(2*tau^2+u^2))*sum((x-mu)^2) } )
return(logl)
}
#test if it works for mu=1, tau=2
head(normal.lik1(c(1,2),x))
#Does not work..
I want to be able to plug in the vector for mu and plot the log-likelihood over mu for a fixed value of tau, say 2. I also want to find the MLEs of tau and mu using the optim function. I tried:
theta.hat<-optim(c(1,1),loglike2,control=list(fnscale=-1),x=x,,method="BFGS")$par
But it does not work.. Any suggestions to how I can write the likelihood?

First, as has been mentioned in the comments to your question, there is no need to use sapply(). You can simply use sum() – just as in the formula of the logLikelihood.
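I changed this part in normal.lik1() and multiplied the expression assigned to logl by minus 1, so that the function returns the negative log-likelihood. Optimizers such as nlm() minimize by default, and minimizing the negative log-likelihood over theta is equivalent to maximizing the log-likelihood. I also pass u as an argument instead of defining it inside the function.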
x <- c(3.3569,1.9247,3.6156,1.8446,2.2196,6.8194,2.0820,4.1293,0.3609,2.6197)
u <- c(1,3,0.5,0.2,2,1.7,0.4,1.2,1.1,0.7)
normal.lik1 <- function(theta,x,u){
mu <- theta[1]
tau <- theta[2]
n <- length(x)
logl <- - n/2 * log(2*pi) - 1/2 * sum(log(tau^2+u^2)) - 1/2 * sum((x-mu)^2/(tau^2+u^2))
return(-logl)
}
This can be done using nlm(), for example
nlm(normal.lik1, c(0,1), hessian=TRUE, x=x,u=u)$estimate
where c(0,1) are the starting values for the algorithm.
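Since the question asked about optim() specifically, here is a minimal sketch of the same fit with optim() and method "BFGS". The function name negloglik is made up for this example, and the starting values c(1, 1) are the ones from the question. Because optim() minimizes by default, the negative log-likelihood is passed directly and fnscale is not needed.

```r
x <- c(3.3569, 1.9247, 3.6156, 1.8446, 2.2196, 6.8194, 2.0820, 4.1293, 0.3609, 2.6197)
u <- c(1, 3, 0.5, 0.2, 2, 1.7, 0.4, 1.2, 1.1, 0.7)

# Negative log-likelihood of x_i ~ N(mu, tau^2 + u_i^2); 'negloglik' is a
# hypothetical name used only in this sketch.
negloglik <- function(theta, x, u) {
  mu  <- theta[1]
  tau <- theta[2]
  v   <- tau^2 + u^2                     # per-observation variance
  0.5 * sum(log(2 * pi * v) + (x - mu)^2 / v)
}

# optim() minimizes by default, so no fnscale = -1 is required here.
fit <- optim(c(1, 1), negloglik, x = x, u = u, method = "BFGS")
fit$par          # estimates of mu and tau
fit$convergence  # 0 indicates that optim() reports convergence
```

Note that tau only enters through tau^2, so the sign of the estimated tau is not identified; taking abs(fit$par[2]) gives the conventional non-negative value.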
To plot the logLikelihood for a range of values of mu and some fixed tau you can adjust the function such that mu and tau are separate numeric arguments.
normal.lik2 <- function(mu,tau,x,u){
n <- length(x)
logl <- - n/2 * log(2*pi) - 1/2 * sum(log(tau^2+u^2)) - 1/2 * sum((x-mu)^2/(tau^2+u^2))
return(logl)
}
Then define some range for mu, compute the loglikelihood and use plot().
range.mu <- seq(-10,20,0.1)
loglik <- sapply(range.mu, function(m) normal.lik2(mu=m,tau=2,x=x,u=u))
plot(range.mu, loglik, type = "l")
I'm sure there are more elegant ways to do this but this does the trick.

Related

calculate density of multivariate normal distribution manually

I want to calculate the density of a multivariate normal distribution manually. As inputs of my function, I have x which is a n*p matrix of data points, a vector mu with n means and a covariance matrix sigma of dim p*p.
I wrote the following function for this:
dmnorm <- function(mu, sigma, x){
k <- ncol(sigma)
x <- t(x)
dmn <- exp((-1/2)*t(x-mu)%*%solve(sigma)%*%(x-mu))/sqrt(((2*pi)^k)*det(sigma))
return(dmn)
}
My own function gives me a matrix of n*n. However, I should get a vector of length n.
In the end, I want the same results as I get from using the dmvnorm() function from the mvtnorm package. What's wrong with my code?
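The expression t(x-mu)%*%solve(sigma)%*%(x-mu) is n x n (one row and one column per data point, since x has been transposed to p x n), so that's why your result is that size. The quadratic forms you need sit on the diagonal of that matrix, which you can get using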
diag(t(x-mu)%*%solve(sigma)%*%(x-mu))
so the full function should be
dmnorm <- function(mu, sigma, x){
k <- ncol(sigma)
x <- t(x)
dmn <- exp((-1/2)*diag(t(x-mu)%*%solve(sigma)%*%(x-mu)))/sqrt(((2*pi)^k)*det(sigma))
dmn
}
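As a possible alternative that avoids building the full n x n matrix, the quadratic forms can be computed row by row with mahalanobis() from the stats package, which returns the squared Mahalanobis distance for each row of x. This is only a sketch: the name dmnorm2 and the small test inputs are made up, and mu is assumed to be a vector of p means (one per column of x).

```r
# Sketch: multivariate normal density for each row of x, using
# stats::mahalanobis() for the quadratic form (x_i - mu)' sigma^{-1} (x_i - mu).
# 'dmnorm2' is a hypothetical name; mu is assumed to have length p = ncol(x).
dmnorm2 <- function(mu, sigma, x) {
  k  <- ncol(sigma)
  d2 <- mahalanobis(x, center = mu, cov = sigma)  # vector of length n
  exp(-d2 / 2) / sqrt((2 * pi)^k * det(sigma))
}

# Illustrative inputs (not from the original question).
x     <- matrix(c(0, 0, 1, 2, -1, 0.5), ncol = 2, byrow = TRUE)
mu    <- c(0, 1)
sigma <- matrix(c(2, 0.5, 0.5, 1), 2, 2)
dens  <- dmnorm2(mu, sigma, x)
dens  # one density value per row of x
```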

D function returning zero

I am attempting to differentiate the log likelihood function of a multivariate normal distribution,
NLLmvnorm <- function(data, mu, sigma) {
a <- qr(sigma)
logdet <- sum(log(abs(diag(a$qr))))
sigma.inv <- fast.ginv(sigma)
-0.5 * (logdet + t(data-mu) %*% sigma.inv %*% (data-mu))
}
Using D, I differentiate with respect to mu and sigma and store each result in a separate variable:
du <- D(expression(NLLmvnorm),"mu")
ds <- D(expression(NLLmvnorm),"sigma")
However, both of these results return 0. Am I doing something incorrect, or am I misinterpreting what D does? If it helps, my goal is to find the roots of du and ds to find maximum likelihood using solve.
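A likely explanation is that D() differentiates a symbolic expression, not an R function: expression(NLLmvnorm) contains only the symbol NLLmvnorm, which does not involve mu or sigma, so its derivative is 0. The sketch below shows the difference using a univariate expression written out by hand; the symbols x, mu and s2 are made up for illustration. As far as I know, D() only understands a limited set of arithmetic and elementary functions and does not handle matrix operators such as %*%, so the multivariate case would need analytic gradients or numerical optimization instead.

```r
# The derivative of a bare symbol with respect to another name is zero.
d_symbol <- D(expression(NLLmvnorm), "mu")
d_symbol

# D() works when the formula itself is given as an expression or call.
# Hypothetical univariate analogue: -0.5 * (log(s2) + (x - mu)^2 / s2)
ll  <- quote(-0.5 * (log(s2) + (x - mu)^2 / s2))
dmu <- D(ll, "mu")

# Evaluate the symbolic derivative at some illustrative values.
val <- eval(dmu, list(x = 3, mu = 1, s2 = 2))
val  # analytically (x - mu) / s2
```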

I would like to code up an equation in R that calculates Ridge Regression Criteria

The criterion is given by the function in the attached image.
I have been working at it for a while now but this is what I have. I need to
ridge.crit <- function(y, X, lambda, beta){
n <- length(y) #number of rows
p <- ncol(X) #number of predictor variables
for(i in 1:n){
(y[i] - beta[1] - sum(beta[2:p+1] * X[i,]))^2
##WE GOTTA ADD THEM TOGETHER AND STORE FINAL ANSWER AND RETURN IT!
}
return(final_answer)
}
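The attached image is not available here, so the following is only a sketch under the assumption that the criterion is the usual ridge one: the residual sum of squares plus lambda times the sum of squared slope coefficients, with beta[1] as an unpenalized intercept. The name ridge.crit.sketch is made up. Note also that in the loop above, beta[2:p+1] is parsed as beta[(2:p)+1] because : binds tighter than +; beta[-1] sidesteps that.

```r
# Sketch of a ridge criterion, assuming the standard form
#   sum_i (y_i - beta_0 - x_i' beta)^2 + lambda * sum_j beta_j^2
# with beta[1] the intercept (not penalized) and beta[-1] the slopes.
ridge.crit.sketch <- function(y, X, lambda, beta) {
  resid <- y - beta[1] - X %*% beta[-1]   # vectorized over all n rows
  sum(resid^2) + lambda * sum(beta[-1]^2)
}

# Illustrative inputs (not from the original question).
y    <- c(1, 2, 3)
X    <- matrix(c(1, 0, 0, 1, 1, 1), nrow = 3, ncol = 2)
beta <- c(0.5, 1, -1)
crit <- ridge.crit.sketch(y, X, lambda = 2, beta = beta)
crit
```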

MLE negative log-likelihood truncated normal in R

I am a bit stuck with my code and I would greatly appreciate it if somebody could help me!
The code below is a function I've written to calculate the maximum likelihood estimates of μ and σ² on the basis of an iid sample x_1, …, x_n from a truncated normal distribution with parameters μ, σ² and τ, where the value of τ is known.
This is the code I have up to now.
tnorm.negll <- function(theta,xbar,SS,n,tau){
#storing variance into a more appropriate name
sigmasq <- theta[2]
logvar <- log(sigmasq)
if (sigmasq<0) {
warning("Input for variance is negative")
return(Inf)
}
#storing mean into a more appropriate name
mu <- theta[1]
#standard normal
phi <- function(x) exp(-x^2/2)/sqrt(2*pi)
#using the given formula for the negative log-likelihood
negloglike <- (n/2)*(logvar) + (SS - 2*n*mu*xbar + n*mu^2)/(2*sigmasq) + n*log(1-phi((tau-mu)/(sqrt(sigmasq))))
#returning value:
return(negloglike)
}
I now need to write another function, say trnorm.MLE(), that will use the function above to calculate the maximum likelihood estimates of μ and σ² given a vector of observations x_1, …, x_n and a value of τ.
I decided that my function should have the following arguments:
x : vector of observations,
tau : value for threshold,
theta0 : vector with elements theta0[1] initial guess for mu and theta0[2] initial guess for sigmasq.
Ideally, the trnorm.MLE() function should return a vector of length 2, where the first component is the MLE of μ and the second component is the MLE of σ².
As a guess, I've written this:
x <- rep(1:11, c(17,48,68,71,42,19,14,7,1,0,1))
tau <- 3
theta0 <- c(3,15)
xbar <- mean(x)
SS <- sum(x^2)
n <- length(x)
nlm(tnorm.negll,theta0,xbar,SS,n,tau,hessian = TRUE)
I know this is far from correct but I cannot express it correctly!
I get various errors
Error in nlm(tnorm.negll, theta0, xbar, SS, n, tau, hessian = TRUE) :
invalid function value in 'nlm' optimizer
or
Error in if (theta[2] >= 0) { : missing value where TRUE/FALSE needed
Thank you for reading this. Hopefully somebody can guide me through this?
Best Regards.
edit : changed how tnorm.negll returns its results
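One way such a wrapper could look is sketched below. It rests on several assumptions that are not confirmed by the question: the sample is left-truncated at tau (only x > tau is observed), the Φ in the formula is the standard normal CDF (pnorm(), not the density), and the variance is kept positive with a box constraint in optim() rather than a check inside the likelihood. The names tnorm.negll2 and the quasi-sample used to try it out are made up for illustration.

```r
# Hypothetical negative log-likelihood (up to a constant) for a normal
# distribution truncated to x > tau; theta = c(mu, sigmasq).
tnorm.negll2 <- function(theta, xbar, SS, n, tau) {
  mu      <- theta[1]
  sigmasq <- theta[2]
  (n / 2) * log(sigmasq) +
    (SS - 2 * n * mu * xbar + n * mu^2) / (2 * sigmasq) +
    # log(1 - Phi((tau - mu) / sigma)), computed on the log scale for stability
    n * pnorm((tau - mu) / sqrt(sigmasq), lower.tail = FALSE, log.p = TRUE)
}

# Wrapper with the arguments described in the question.
trnorm.MLE <- function(x, tau, theta0) {
  fit <- optim(theta0, tnorm.negll2,
               xbar = mean(x), SS = sum(x^2), n = length(x), tau = tau,
               method = "L-BFGS-B", lower = c(-Inf, 1e-8))  # keeps sigmasq > 0
  fit$par  # c(estimate of mu, estimate of sigmasq)
}

# Illustrative data: a deterministic quasi-sample from N(4, 2^2) truncated to x > 3.
tau <- 3
p0  <- pnorm(tau, mean = 4, sd = 2)
x   <- qnorm(p0 + (1 - p0) * ppoints(200), mean = 4, sd = 2)
theta0 <- c(mean(x), var(x))
est <- trnorm.MLE(x, tau, theta0)
est
```

The likelihood of a truncated normal can be quite flat, so the estimates may be sensitive to the starting values; trying a few different theta0 is a cheap check.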

How to put mathematical constraints with GenSA function in R

I am currently trying to use Simulated Annealing package GenSA in order to minimize the function below :
efficientFunction <- function(v) {
t(v) %*% Cov_Mat %*% v
}
Where Cov_Mat is a covariance matrix obtained from 4 assets and v is a weight vector of dimension 4.
I'm trying to solve the Markowitz asset allocation approach this way and I would like to know how I could introduce mathematical constraint such as the sum of all coefficients have to equal 1 :
sum(v) = 1
Moreover since I intend to rely on the GenSA function, I would like to use something like this with the constraint :
v <- c(0.25, 0.25, 0.25, 0.25)
dimension <- 4
lower <- rep(0, dimension)
upper <- rep(1, dimension)
out <- GenSA(v, lower = lower, upper = upper, fn = efficientFunction)
I have found in this paper : http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.97.6091&rep=rep1&type=pdf
how to handle such constraint within the Simulated Annealing Algorithm but I don't know how I could implement it in R.
I'd be very grateful for any advice. It is my first time using SO, so don't hesitate to tell me if I have the wrong approach in the way I ask questions.
A possible approach would be to add a penalty term to the objective, in the spirit of Lagrange multipliers (cf., http://en.wikipedia.org/wiki/Lagrange_multiplier). For example, set
efficientFunction <- function(v) {
lambda <- 100
t(v) %*% Cov_Mat %*% v + lambda * abs( sum(v) - 1 )
}
With this choice, minimizing efficientFunction also drives the penalty term lambda * abs( sum(v) - 1 ) toward zero. The weight lambda is set to an arbitrary but sufficiently high level.
So the function itself doesn't appear to have any constraints that you can set. However, you can reparameterize your function to force the constraint. How about
efficientFunction <- function(v) {
v <- v/sum(v)
t(v) %*% Cov_Mat %*% v
}
Here we normalize the values of v so that they will sum to 1. Then, when we get the output parameters, we need to perform the same transformation
out <- GenSA(v, lower = lower, upper = upper, fn = efficientFunction)
out$par/sum(out$par)
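To see the normalization trick in isolation, here is a small self-contained sketch. It swaps GenSA for base R's optim() with method "L-BFGS-B" so that it runs without extra packages, and it uses a made-up diagonal covariance matrix; neither choice comes from the question. The lower bound is set slightly above 0 so that sum(v) can never be zero.

```r
# Hypothetical covariance matrix for 4 assets (illustrative numbers only).
Cov_Mat <- diag(c(0.04, 0.09, 0.16, 0.25))

# Objective with the weights normalized inside the function, so any
# positive v corresponds to weights that sum to 1.
efficientFunction <- function(v) {
  w <- v / sum(v)
  drop(t(w) %*% Cov_Mat %*% w)   # portfolio variance as a plain number
}

v0  <- rep(0.25, 4)
out <- optim(v0, efficientFunction, method = "L-BFGS-B",
             lower = rep(1e-6, 4), upper = rep(1, 4))

# Apply the same transformation to the optimizer's output.
w_hat <- out$par / sum(out$par)
w_hat
sum(w_hat)
```

The objective is unchanged when v is rescaled, so out$par itself is not unique; only the normalized weights w_hat are meaningful.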
