MLE negative log-likelihood truncated normal in R

I am a bit stuck with my code and would greatly appreciate some help.
The code below is a function I've written to compute the negative log-likelihood needed for the maximum likelihood estimates of μ and σ2, based on an iid sample x1, …, xn from a truncated normal distribution with parameters μ, σ2 and τ, where the value of τ is known.
This is the code I have so far.
tnorm.negll <- function(theta,xbar,SS,n,tau){
  #storing variance into a more appropriate name
  sigmasq <- theta[2]
  logvar <- log(sigmasq)
  if (sigmasq<0) {
    warning("Input for variance is negative")
    return(Inf)
  }
  #storing mean into a more appropriate name
  mu <- theta[1]
  #standard normal
  phi <- function(x) exp(-x^2/2)/sqrt(2*pi)
  #using the given formula for the negative log-likelihood
  negloglike <- (n/2)*(logvar) + (SS - 2*n*mu*xbar + n*mu^2)/(2*sigmasq) + n*log(1-phi((tau-mu)/(sqrt(sigmasq))))
  #returning value:
  return(negloglike)
}
I now need to write another function, say trnorm.MLE(), that will use the function above to calculate the maximum likelihood estimates of μ and σ2 given a vector of observations x1, …, xn and a value of τ.
I decided that my function should have the following arguments:
x : vector of observations,
tau : value for threshold,
theta0 : vector with elements theta0[1] initial guess for mu and theta0[2] initial guess for sigmasq.
Ideally, the trnorm.MLE() function should return a vector of length 2, where the first component is the MLE of μ and the second component is the MLE of σ2.
As a first attempt, I've written this:
x <- rep(1:11, c(17,48,68,71,42,19,14,7,1,0,1))
tau <- 3
theta0 <- c(3,15)
xbar <- mean(x)
SS <- sum(x^2)
n <- length(x)
nlm(tnorm.negll,theta0,xbar,SS,n,tau,hessian = TRUE)
I know this is far from correct, but I cannot work out how to express it properly.
I get various errors:
Error in nlm(tnorm.negll, theta0, xbar, SS, n, tau, hessian = TRUE) :
invalid function value in 'nlm' optimizer
or
Error in if (theta[2] >= 0) { : missing value where TRUE/FALSE needed
Thank you for reading this. Hopefully somebody can guide me through this?
Best Regards.
edit : changed how tnorm.negll returns its results
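A minimal sketch of how trnorm.MLE() could be put together, under two assumptions that the question does not spell out: the distribution is truncated from below at tau (only values x > tau are observed), and the Φ in the given formula is the standard normal CDF, available in R as pnorm(), rather than the density. The sketch also checks the variance before taking its logarithm and swaps nlm() for optim(); the simulated data are purely illustrative.

```r
# Negative log-likelihood (up to an additive constant) of a normal
# distribution truncated from below at tau (assumption: x > tau).
tnorm.negll <- function(theta, xbar, SS, n, tau) {
  mu <- theta[1]
  sigmasq <- theta[2]
  # Reject invalid variances before taking logs or square roots
  if (!is.finite(sigmasq) || sigmasq <= 0) return(Inf)
  sigma <- sqrt(sigmasq)
  # log(1 - Phi((tau - mu) / sigma)), computed on the log scale for stability
  log_surv <- pnorm((tau - mu) / sigma, lower.tail = FALSE, log.p = TRUE)
  (n / 2) * log(sigmasq) +
    (SS - 2 * n * mu * xbar + n * mu^2) / (2 * sigmasq) +
    n * log_surv
}

# Returns c(MLE of mu, MLE of sigma^2); optim() defaults to Nelder-Mead
trnorm.MLE <- function(x, tau, theta0) {
  fit <- optim(theta0, tnorm.negll,
               xbar = mean(x), SS = sum(x^2), n = length(x), tau = tau)
  if (fit$convergence != 0) warning("optim() did not report convergence")
  fit$par
}

# Illustrative data: N(5, 2^2) draws, keeping only those above tau = 3
set.seed(1)
draws <- rnorm(5000, mean = 5, sd = 2)
x_sim <- draws[draws > 3]
est <- trnorm.MLE(x_sim, tau = 3, theta0 = c(mean(x_sim), var(x_sim)))
est
```

Starting from the sample mean and variance keeps the first evaluation finite, which the Nelder-Mead method in optim() requires.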

Related

nonlinear least square algorithm code with R

I have a question about minimizing the sum of squared residuals to estimate "theta" in the regression function below. I do not want to use any built-in functions or packages in R, and would like to write the iterative algorithm myself.
The regression function is: y_k=exp(-theta |x_k|)+e_k, for k=1,...,n
Here is my code, but it gives me the following error for some sets of x and y. Thanks in advance for your suggestions!
Error in if (abs(dif) < 10^(-5)) break :
missing value where TRUE/FALSE needed
Code:
theta <- -sum(log(abs(y)))/sum(abs(x))
#Alg:
rep <- 1
while(rep<=1000){
  Ratio <- sum((abs(x)*exp(-theta*abs(x)))*(y-exp(-theta*abs(x))))/
    sum((abs(x)^2*exp(-theta*abs(x)))*(y-2*exp(-theta*abs(x))))
  if(is.na(Ratio)){
    thetanew <- theta
  } else {
    thetanew <- theta+Ratio
  }
  dif <- thetanew-theta
  theta <- thetanew
  if(abs(dif)<10^(-5)) break
  rep <- rep+1
}
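One way this error can arise: if any y is exactly zero, log(abs(y)) is -Inf and the starting theta is Inf. Ratio is then NaN, thetanew stays at Inf, and dif becomes Inf - Inf = NaN, which if() cannot test. The sketch below is the same Newton update with explicit guards for non-finite values. The function name fit_theta and the simulated data are hypothetical, chosen only for illustration.

```r
# Newton iteration for y_k = exp(-theta * |x_k|) + e_k.
# fit_theta() is a hypothetical helper name, not from the question.
fit_theta <- function(x, y, theta0, tol = 1e-5, max_iter = 1000) {
  if (!is.finite(theta0)) stop("starting value for theta is not finite")
  ax <- abs(x)
  theta <- theta0
  for (iter in seq_len(max_iter)) {
    e <- exp(-theta * ax)
    # Same update as in the question: theta + S'-term / S''-term
    step <- sum(ax * e * (y - e)) / sum(ax^2 * e * (y - 2 * e))
    if (!is.finite(step)) stop("non-finite Newton step at iteration ", iter)
    theta <- theta + step
    if (abs(step) < tol) break
  }
  theta
}

# Illustrative data with true theta = 0.7 and small noise
set.seed(1)
x <- runif(50, -3, 3)
y <- exp(-0.7 * abs(x)) + rnorm(50, sd = 0.01)
theta0 <- -sum(log(abs(y))) / sum(abs(x))  # same starting value as above
theta_hat <- fit_theta(x, y, theta0)
theta_hat
```

Stopping with an explicit message makes it easier to see which data sets produce a non-finite start or step, instead of failing inside if().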

How to fix code in RMarkdown for simulation that will not run due to "Error: Discrete value supplied to continuous scale"?

I am wondering what is wrong with my following R code (R markdown)? I keep getting an error message for the last line that says "Error in h(x.n, df = N - 2) : unused argument (df = N - 2)". I am very confused because my TA looked at my code and told me that it should run perfectly.
For context, this is the problem I am working on:
library(MASS)
library(tidyverse)
library(hypergeo)
set.seed(1)
rm(list=ls())
N=7
Nsim=10000
rho=0
Sigma=matrix(c(1,rho,rho,1),2,2)
Sigma
mu=c(0,0)
r_vec=matrix(NaN,nrow=1,ncol=Nsim)
#have function mvrnorm-->simulate from multivariate normal distribution. N=7 Correlation matrix sigma. before X was fixed but now is random and formal dependence from Y that I can control. Compute rho hat and see if on average it gives me correct rho. Check how serious bias is when the expected value of rho hat isn't equal to rho. I want a feeling about whether this is something I should worry about or not
for (i in 1:Nsim){
  data=mvrnorm(N, mu, Sigma)
  r_vec[i]=cor(data[,1],data[,2])
}
mean(r_vec)
update.packages("deSolve")
x.n=seq(-1,1,0.1)
sim_rho0<-function(Nsim,N,rho){
  rho=rho
  mu=c(0,0)
  Sigma=matrix(c(1,rho,rho,1),nrow=2)
  r_vec=matrix(NaN,nrow=Nsim)
  for (i in 1:Nsim){
    data=mvrnorm(N, mu, Sigma)
    r_vec[i]=cor(data[,1],data[,2])
  }
  # here we compute t, which should have a t_{N-2} distribution. This is different here and trying to reconstruct the .Not a mathematical proof. Might be a mistake*****
  #range of values and plotting density for each one
  h<- function(N,rho,x.n){
    rho=rho
    a <- ((N-2)*(gamma(N-1))*(1-rho^2)^(N-1)/2*(1-x.n^2)^(N-4)/2)/((2*pi)*(sqrt(N-1/2))((1-x.n*rho)^(N-3/2)))
    b <- hypergeo(1/2, 1/2, (2*N-1/2), ((x.n*rho)+1)/2)
    h2 = a*b
    return(h2)
  }
  t=r_vec*sqrt(N-2)/(1-r_vec^2)
  x.n=seq(-1,1,0.1)
  y.n= h(N=10, rho=0.8, x.n=x.n)
  df=tibble(X=t)
  df2=tibble(x=x.n,y=y.n)
  ggplot()+geom_histogram(data=df, aes(x=X,y=..density..),binwidth=0.2,
    color="black", fill="white")+ geom_line(data = df2, aes(x = x, y = y),
    color = "red")+xlim(-5,5)
}
rho=0.8
Nsim=3000
N=10
sim_rho0(Nsim,N,rho)

You've defined the function h with the arguments N, rho and x.n. You then call it with the argument df, which h does not have, hence the error. You need to call h with the correct arguments: don't leave out N and rho, and if the value x.n should be passed to the function argument x.n, name it explicitly rather than relying on a positional argument. I also recommend following a style guide, e.g. https://style.tidyverse.org/
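A small sketch of the difference, using a toy function h_demo() with the same argument names as h. The function body is a placeholder chosen for illustration, not the density from the question.

```r
# Toy stand-in for h(); only the argument names matter here
h_demo <- function(N, rho, x.n) {
  (1 - rho * x.n)^(-(N - 1.5))
}

x.n <- seq(-1, 1, 0.1)

# Passing an argument the function does not define fails
# with an "unused argument" error
bad_call <- tryCatch(
  h_demo(x.n, df = 10 - 2),
  error = function(e) e
)
inherits(bad_call, "error")

# Naming every argument removes any ambiguity about what goes where
ok_call <- h_demo(N = 10, rho = 0.8, x.n = x.n)
length(ok_call)
```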

R: conditional expected value

Hello everybody (this is my first post in here)!
I'm having a problem with finding the conditional expected value for a given distribution.
Suppose that we need to find E( x | x>0.5 ), where x has gev (generalised extreme value) distribution, with density dgev(x, xi, sigma, mu). What I was trying to do was
library(evir)
func1 <- function(x) {x*dgev(x, xi, sigma, mu)}
integral <- integrate(func1, lower = 0.5, upper = 10000, subdivisions = 10000)
cond.exp.val <- as.numeric(integral[1])/(1-q)
where q is the value that gives qgev(q, xi, sigma, mu) = 0.5, used for normalisation.
The result greatly depends on the 'upper' parameter of integrate() function and for higher values of this parameter the integral diverges. As my distribution parameters are
xi <- 0.81
sigma <- 0.0067
mu <- 0.0072
this integration should be feasible and convergent. Do you have any ideas what I am doing wrong or is there any built-in R function that may calculate such conditional expected value?

Generally, you are advised to use Inf rather than a large number when integrating the right tail of a density. See details in ?integrate. I took your description of q as being a value obtained by iteration and I stopped when I got within 4 decimal places of 0.5 using q <- 0.99315:
qgev(.99315, xi, sigma, mu)
[1] 0.4998413
You also extracted the value from the result of integrate() incorrectly. Use either "[[" or "$" when working with lists:
func1 <- function(x) {x*dgev(x, xi, sigma, mu)}
integral <- integrate(func1, lower = 0.5, upper = Inf, subdivisions = 10000)
(cond.exp.val <- integral[[1]]/(1-.99315)) # `as.numeric` not needed
#[1] 2.646068
I have concerns that your description of how to get q was misleading, since values above 1 should not be an expectation derived from a statistical PDF.
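A sketch that avoids tuning q by hand: the normalising constant is P(x > 0.5) = 1 - F(0.5), which comes straight from the CDF. To stay self-contained, the GEV density and CDF are written out here as gev_pdf() and gev_cdf() (hypothetical names, valid for xi > 0) with named parameters. Note that in evir the positional order is dgev(x, xi, mu, sigma), so the calls dgev(x, xi, sigma, mu) above swap mu and sigma; the value q = 0.99315 corresponds to that swapped order.

```r
# Hand-written GEV density and CDF for xi > 0 (hypothetical helper names)
gev_pdf <- function(x, xi, mu, sigma) {
  s <- 1 + xi * (x - mu) / sigma
  ifelse(s > 0, s^(-1 / xi - 1) * exp(-s^(-1 / xi)) / sigma, 0)
}
gev_cdf <- function(q, xi, mu, sigma) {
  s <- pmax(1 + xi * (q - mu) / sigma, 0)
  exp(-s^(-1 / xi))
}

xi <- 0.81
sigma <- 0.0067
mu <- 0.0072
u <- 0.5  # conditioning threshold

# Numerator: integral of x * f(x) over (u, Inf), using Inf as upper limit
num <- integrate(function(x) x * gev_pdf(x, xi = xi, mu = mu, sigma = sigma),
                 lower = u, upper = Inf, subdivisions = 10000)$value

# Denominator: P(x > u) taken directly from the CDF
p_exceed <- 1 - gev_cdf(u, xi = xi, mu = mu, sigma = sigma)

cond_mean <- num / p_exceed
cond_mean
```

With evir loaded, 1 - pgev(0.5, xi = xi, mu = mu, sigma = sigma) should play the same role as p_exceed.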

Writing a proper normal log-likelihood in R

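I have a problem regarding the following model: the x_i are independent with x_i ~ N(μ, tau^2 + u_i^2), i = 1, …, n,
where I want to make inference on μ and tau, u is a known vector and x is the data vector. The log-likelihood is
l(μ, tau) = -(n/2) log(2π) - (1/2) Σ log(tau^2 + u_i^2) - (1/2) Σ (x_i - μ)^2 / (tau^2 + u_i^2), with both sums over i = 1, …, n.
I have a problem writing this log-likelihood in R.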
x <- c(3.3569,1.9247,3.6156,1.8446,2.2196,6.8194,2.0820,4.1293,0.3609,2.6197)
mu <- seq(0,10,length=1000)
normal.lik1<-function(theta,x){
u <- c(1,3,0.5,0.2,2,1.7,0.4,1.2,1.1,0.7)
mu<-theta[1]
tau<-theta[2]
n<-length(x)
logl <- sapply(c(mu,tau),function(mu,tau){logl<- -0.5*n*log(2*pi) -0.5*n*log(tau^2+u^2)- (1/(2*tau^2+u^2))*sum((x-mu)^2) } )
return(logl)
}
#test if it works for mu=1, tau=2
head(normal.lik1(c(1,2),x))
#Does not work..
I want to be able to plug in the vector for mu and plot it over mu for a fixed value of tau, say 2. I also want to find out the MLE's of tau and mu using the optim function. I tried:
theta.hat<-optim(c(1,1),loglike2,control=list(fnscale=-1),x=x,,method="BFGS")$par
But it does not work.. Any suggestions to how I can write the likelihood?

First, as has been mentioned in the comments to your question, there is no need to use sapply(). You can simply use sum() – just as in the formula of the logLikelihood.
I changed this part in normal.lik1() and multiplied the expression that is assigned to logl by minus 1 such that the function computes the minus logLikelihood. You want to search for the minimum over theta since the function returns positive values.
x <- c(3.3569,1.9247,3.6156,1.8446,2.2196,6.8194,2.0820,4.1293,0.3609,2.6197)
u <- c(1,3,0.5,0.2,2,1.7,0.4,1.2,1.1,0.7)
normal.lik1 <- function(theta,x,u){
  mu <- theta[1]
  tau <- theta[2]
  n <- length(x)
  logl <- - n/2 * log(2*pi) - 1/2 * sum(log(tau^2+u^2)) - 1/2 * sum((x-mu)^2/(tau^2+u^2))
  return(-logl)
}
This can be done using nlm(), for example
nlm(normal.lik1, c(0,1), hessian=TRUE, x=x,u=u)$estimate
where c(0,1) are the starting values for the algorithm.
To plot the logLikelihood for a range of values of mu and some fixed tau you can adjust the function such that mu and tau are separate numeric arguments.
normal.lik2 <- function(mu,tau,x,u){
  n <- length(x)
  logl <- - n/2 * log(2*pi) - 1/2 * sum(log(tau^2+u^2)) - 1/2 * sum((x-mu)^2/(tau^2+u^2))
  return(logl)
}
Then define some range for mu, compute the loglikelihood and use plot().
range.mu <- seq(-10,20,0.1)
loglik <- sapply(range.mu, function(m) normal.lik2(mu=m,tau=2,x=x,u=u))
plot(range.mu, loglik, type = "l")
I'm sure there are more elegant ways to do this but this does the trick.
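Since the question asked about optim(), here is a sketch of the same fit with optim() maximising the log-likelihood directly through control = list(fnscale = -1). The function name loglik is chosen for illustration. Because tau enters only through tau^2, its sign is not identified, so the absolute value is reported.

```r
x <- c(3.3569, 1.9247, 3.6156, 1.8446, 2.2196, 6.8194, 2.0820, 4.1293, 0.3609, 2.6197)
u <- c(1, 3, 0.5, 0.2, 2, 1.7, 0.4, 1.2, 1.1, 0.7)

# Log-likelihood for x_i ~ N(mu, tau^2 + u_i^2), with theta = c(mu, tau)
loglik <- function(theta, x, u) {
  v <- theta[2]^2 + u^2
  -length(x) / 2 * log(2 * pi) - 1 / 2 * sum(log(v)) -
    1 / 2 * sum((x - theta[1])^2 / v)
}

# fnscale = -1 makes optim() maximise instead of minimise
fit <- optim(c(1, 1), loglik, x = x, u = u,
             method = "BFGS", control = list(fnscale = -1))

theta_hat <- c(mu = fit$par[1], tau = abs(fit$par[2]))
theta_hat
```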

Using JAGS or STAN when an observed node is the max of latent nodes

I have the following latent variable model: Person j has two latent variables, Xj1 and Xj2. The only thing we get to observe is their maximum, Yj = max(Xj1, Xj2). The latent variables are bivariate normal; they each have mean mu, variance sigma2, and their correlation is rho. I want to estimate the three parameters (mu, sigma2, rho) using only Yj, with data from n patients, j = 1,...,n.
I've tried to fit this model in JAGS (so I'm putting priors on the parameters), but I can't get the code to compile. Here's the R code I'm using to call JAGS. First I generate the data (both latent and observed variables), given some true values of the parameters:
# true parameter values
mu <- 3
sigma2 <- 2
rho <- 0.7
# generate data
n <- 100
Sigma <- sigma2 * matrix(c(1, rho, rho, 1), ncol=2)
X <- MASS::mvrnorm(n, c(mu,mu), Sigma) # n-by-2 matrix
Y <- apply(X, 1, max)
Then I define the JAGS model, and write a little function to run the JAGS sampler and return the samples:
# JAGS model code
model.text <- '
model {
  for (i in 1:n) {
    Y[i] <- max(X[i,1], X[i,2]) # Ack!
    X[i,1:2] ~ dmnorm(X_mean, X_prec)
  }
  # mean vector and precision matrix for X[i,1:2]
  X_mean <- c(mu, mu)
  X_prec[1,1] <- 1 / (sigma2*(1-rho^2))
  X_prec[2,1] <- -rho / (sigma2*(1-rho^2))
  X_prec[1,2] <- X_prec[2,1]
  X_prec[2,2] <- X_prec[1,1]
  mu ~ dnorm(0, 1)
  sigma2 <- 1 / tau
  tau ~ dgamma(2, 1)
  rho ~ dbeta(2, 2)
}
'
# run JAGS code. If latent=FALSE, remove the line defining Y[i] from the JAGS model
fit.jags <- function(latent=TRUE, data, n.adapt=1000, n.burnin, n.samp) {
  require(rjags)
  if (!latent)
    model.text <- sub('\n *Y.*?\n', '\n', model.text)
  textCon <- textConnection(model.text)
  fit <- jags.model(textCon, data, n.adapt=n.adapt)
  close(textCon)
  update(fit, n.iter=n.burnin)
  coda.samples(fit, variable.names=c("mu","sigma2","rho"), n.iter=n.samp)[[1]]
}
Finally, I call JAGS, feeding it only the observed data:
samp1 <- fit.jags(latent=TRUE, data=list(n=n, Y=Y), n.burnin=1000, n.samp=2000)
Sadly this results in an error message: "Y[1] is a logical node and cannot be observed". JAGS does not like me using "<-" to assign a value to Y[i] (I denote the offending line with an "Ack!"). I understand the complaint, but I'm not sure how to rewrite the model code to fix this.
Also, to demonstrate that everything else (besides the "Ack!" line) is fine, I run the model again, but this time I feed it the X data, pretending that it's actually observed. This runs perfectly and I get good estimates of the parameters:
samp2 <- fit.jags(latent=FALSE, data=list(n=n, X=X), n.burnin=1000, n.samp=2000)
colMeans(samp2)
If you can find a way to program this model in STAN instead of JAGS, that would be fine with me.

Theoretically you can implement a model like this in JAGS using the dsum distribution (which in this case uses a bit of a hack as you are modelling the maximum and not the sum of the two variables). But the following code does compile and run (although it does not 'work' in any real sense - see later):
set.seed(2017-02-08)
# true parameter values
mu <- 3
sigma2 <- 2
rho <- 0.7
# generate data
n <- 100
Sigma <- sigma2 * matrix(c(1, rho, rho, 1), ncol=2)
X <- MASS::mvrnorm(n, c(mu,mu), Sigma) # n-by-2 matrix
Y <- apply(X, 1, max)
model.text <- '
model {
  for (i in 1:n) {
    Y[i] ~ dsum(max_X[i])
    max_X[i] <- max(X[i,1], X[i,2])
    X[i,1:2] ~ dmnorm(X_mean, X_prec)
    ranks[i,1:2] <- rank(X[i,1:2])
    chosen[i] <- ranks[i,2]
  }
  # mean vector and precision matrix for X[i,1:2]
  X_mean <- c(mu, mu)
  X_prec[1,1] <- 1 / (sigma2*(1-rho^2))
  X_prec[2,1] <- -rho / (sigma2*(1-rho^2))
  X_prec[1,2] <- X_prec[2,1]
  X_prec[2,2] <- X_prec[1,1]
  mu ~ dnorm(0, 1)
  sigma2 <- 1 / tau
  tau ~ dgamma(2, 1)
  rho ~ dbeta(2, 2)
  #data# n, Y
  #monitor# mu, sigma2, rho, tau, chosen[1:10]
  #inits# X
}
'
library('runjags')
results <- run.jags(model.text)
results
plot(results)
Two things to note:
JAGS isn't smart enough to initialise the matrix of X while satisfying the dsum(max(X[i,])) constraint on its own - so we have to initialise X for JAGS using sensible values. In this case I'm using the simulated values which is cheating - the answer you get is highly dependent on the choice of initial values for X, and in the real world you won't have the simulated values to fall back on.
The max() constraint causes problems to which I can't think of a solution within a general framework: unlike the usual dsum constraint that allows one parameter to decrease while the other increases and therefore both parameters are used at all times, the min() value of X[i,] is ignored and the sampler is therefore free to do as it pleases. This will very very rarely (i.e. never) lead to values of min(X[i,]) that happen to be identical to Y[i], which is the condition required for the sampler to 'switch' between the two X[i,]. So switching never happens, and the X[] that were chosen at initialisation to be the maxima stay as the maxima - I have added a trace parameter 'chosen' which illustrates this.
As far as I can see the other potential solutions to the 'how do I code this' question will fall into essentially the same non-mixing trap which I think is a fundamental problem here (although I might be wrong and would very much welcome working BUGS/JAGS/Stan code that illustrates otherwise).
Solutions to the failure to mix are harder, although something akin to the Carlin & Chib method for model selection may work (force a min(pseudo_X) parameter to be equal to Y to encourage switching). This is likely to be tricky to get working, but if you can get help from someone with a reasonable amount of experience with BUGS/JAGS you could try it - see:
Carlin, B.P., Chib, S., 1995. Bayesian model choice via Markov chain Monte Carlo methods. J. R. Stat. Soc. Ser. B 57, 473–484.
Alternatively, you could try thinking about the problem slightly differently and model X directly as a matrix with the first column all missing and the second column all equal to Y. You could then use dinterval() to set a constraint on the missing values that they must be lower than the corresponding maximum. I'm not sure how well this would work in terms of estimating mu/sigma2/rho but it might be worth a try.
By the way, I realise that this doesn't necessarily answer your question but I think it is a useful example of the difference between 'is it codeable' and 'is it workable'.
Matt
ps. A much smarter solution would be to consider the distribution of the maximum of two normal variates directly - I am not sure if such a distribution exists, but if it does and you can get a PDF for it then the distribution could be coded directly using the zeros/ones trick without having to consider the value of the minimum at all.
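For the exchangeable case in the question (equal means mu, equal variances sigma2, correlation rho), such a density can be written down. Conditioning on which variable is the maximum gives f(y) = (2/sigma) * phi(z) * Phi(z * sqrt((1 - rho)/(1 + rho))), with z = (y - mu)/sigma, where phi and Phi are the standard normal density and CDF. The R sketch below codes this density and maximises the likelihood with optim(). The function name dmaxnorm, the unconstrained parametrisation and the simulated data are illustrative assumptions. The three parameters may be only weakly identified from the maxima alone, so the estimates should be read with caution.

```r
# Density of Y = max(X1, X2) for an exchangeable bivariate normal
# (dmaxnorm is a hypothetical helper name)
dmaxnorm <- function(y, mu, sigma2, rho, log = FALSE) {
  sigma <- sqrt(sigma2)
  z <- (y - mu) / sigma
  out <- log(2) + dnorm(z, log = TRUE) - log(sigma) +
    pnorm(z * sqrt((1 - rho) / (1 + rho)), log.p = TRUE)
  if (log) out else exp(out)
}

# Simulate latent pairs with base R only, then keep the maxima
set.seed(1)
n <- 100; mu <- 3; sigma2 <- 2; rho <- 0.7
z1 <- rnorm(n); z2 <- rnorm(n)
X <- cbind(mu + sqrt(sigma2) * z1,
           mu + sqrt(sigma2) * (rho * z1 + sqrt(1 - rho^2) * z2))
Y <- pmax(X[, 1], X[, 2])

# Negative log-likelihood on an unconstrained scale:
# par = c(mu, log(sigma2), atanh(rho))
negll <- function(par, y) {
  -sum(dmaxnorm(y, par[1], exp(par[2]), tanh(par[3]), log = TRUE))
}
fit <- optim(c(mean(Y), log(var(Y)), 0), negll, y = Y)
est <- c(mu = fit$par[1], sigma2 = exp(fit$par[2]), rho = tanh(fit$par[3]))
est
```

The same log-density could in principle be supplied to JAGS through the zeros trick, or added to target in Stan, without sampling the latent X at all.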

I believe you can model this in the Stan language treating the likelihood as a two component mixture with equal weights. The Stan code could look like
data {
  int<lower=1> N;
  vector[N] Y;
}
parameters {
  vector<upper=0>[2] diff[N];
  real mu;
  real<lower=0> sigma;
  real<lower=-1,upper=1> rho;
}
model {
  vector[2] case_1[N];
  vector[2] case_2[N];
  vector[2] mu_vec;
  matrix[2,2] Sigma;
  for (n in 1:N) {
    case_1[n][1] = Y[n]; case_1[n][2] = Y[n] + diff[n][1];
    case_2[n][2] = Y[n]; case_2[n][1] = Y[n] + diff[n][2];
  }
  mu_vec[1] = mu; mu_vec[2] = mu;
  Sigma[1,1] = square(sigma);
  Sigma[2,2] = Sigma[1,1];
  Sigma[1,2] = Sigma[1,1] * rho;
  Sigma[2,1] = Sigma[1,2];
  // log-likelihood
  target += log_mix(0.5, multi_normal_lpdf(case_1 | mu_vec, Sigma),
                    multi_normal_lpdf(case_2 | mu_vec, Sigma));
  // insert priors on mu, sigma, and rho
}
