Problem with numerical integration and optim in R

I am trying to optimize a tricky likelihood that involves a combination of integrate and optim in R for some parameters of interest. My question is more about code than statistics, which is why I am posting here.
I searched online and didn't find anything, so I made a few unconvincing attempts. I would like to estimate my parameters beta with the nuisance term b integrated out, so that b has no impact on the estimates.
I have run into various errors from integrate and optim.
Here is an example of what I am trying to do.
X <- matrix(c(1,1,1,1,1,56,54,32,12,9), nrow=5, ncol=2)
y <- matrix(c(0,1,1,1,0), nrow=5, ncol=1)

f <- function(beta){
  g <- function(X, y, b){
    (1/(1 + exp(-(X%*%beta + b))))^y - (1-(1/(1 + exp(-(X%*%beta + b)))))^(1-y)
  }
  integrate(Vectorize(g), lower = 0, upper = Inf, X = X, y = y)
}

optim(par=c(1,0), f, method="BFGS", hessian=TRUE)
I would like estimates of my beta parameters using optim.
I have been working on this for a week and I am really struggling to obtain estimates for my two parameters beta0 and beta1.
Different approaches to this estimation, such as the EM algorithm or Gauss-Hermite quadrature, are welcome.
Thanks for any help.
Loïc.
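One way such a marginal likelihood could be set up is sketched below. This is only a sketch, not the asker's model: it assumes b is a normally distributed random intercept with a fixed standard deviation (sd_b = 1 is an illustrative choice) and uses the usual Bernoulli product p^y * (1-p)^(1-y) rather than the difference written above.

X <- matrix(c(1,1,1,1,1,56,54,32,12,9), nrow = 5, ncol = 2)
y <- c(0, 1, 1, 1, 0)   # same data as above, y as a plain vector

# Negative marginal log-likelihood: for each observation, integrate the
# random intercept b out of the Bernoulli likelihood, weighting by a
# normal density for b (an assumption made for this sketch).
negloglik <- function(beta, sd_b = 1) {
  eta <- drop(X %*% beta)
  contrib <- function(eta_i, y_i) {
    integrand <- function(b) {
      p <- plogis(eta_i + b)
      p^y_i * (1 - p)^(1 - y_i) * dnorm(b, 0, sd_b)
    }
    integrate(integrand, lower = -Inf, upper = Inf)$value
  }
  -sum(log(mapply(contrib, eta, y)))
}

fit <- optim(par = c(0, 0), fn = negloglik, method = "BFGS", hessian = TRUE)
fit$par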

Related

Using optim to choose initial values for nls

One method I have seen in the literature is the use of optim() to choose initial values for nonlinear models fitted with nls or nlme; however, I am puzzled by the actual implementation.
Take an example using COVID data from Alachua, FL:
dat=data.frame(x=seq(1,10,1), y=c(27.9,23.1,24.6,33.0,48.0,136.4,243.4,396.7,519.9,602.8))
x are time points and y is the number of people infected per 10,000 people
Now, if I wanted to fit a four-parameter logistic model in nls, I could use
n1 <- nls(y ~ SSfpl(x, A, B, M, S), data = dat)
But now imagine that parameter estimation is highly sensitive to the initial values so I want to optimize my approach. How would this be achieved?
The way I have thought to try this is as follows:
fun_to_optim <- function(data, guess){
  x = data$x
  y = data$y
  A = guess[1]
  B = guess[2]
  M = guess[3]
  S = guess[4]
  y = A + (B-A)/(1+exp((M-x)/S))
  return(-sum(y))
}

optim(fn=fun_to_optim, data=dat,
      par=c(10,10,10,10),
      method="Nelder-Mead")
The result from optim() is wrong but I cannot see my error. Thank you for any assistance.
The main issue is that you're not computing/returning the sum of squares from your objective function. However: I think you really have it backwards. Using nls() with SSfpl is about the best you're going to do in terms of optimization: it has sensible heuristics for picking starting values (SS stands for "self-starting"), and it provides a gradient function for the optimizer. It's not impossible that, with a considerable amount of work, you could find better heuristics for picking starting values for a particular system, but in general switching from nls to optim + Nelder-Mead will leave you worse off than when you started (illustration below).
fun_to_optim <- function(data, guess){
  x = data$x
  y = data$y
  A = guess[1]
  B = guess[2]
  M = guess[3]
  S = guess[4]
  y_pred = A + (B-A)/(1+exp((M-x)/S))
  return(sum((y-y_pred)^2))
}
Fit with optim() using (1) your suggested starting values; (2) better starting values that are somewhere nearer the correct ones (you could get most of these by knowing the geometry of the function: A is the left asymptote, B is the right asymptote, M is the midpoint, S is the scale); (3) the same as #2 but using BFGS rather than Nelder-Mead.
opt1 <- optim(fn=fun_to_optim, data=dat,
              par=c(A=10, B=10, M=10, S=10),
              method="Nelder-Mead")
opt2 <- optim(fn=fun_to_optim, data=dat,
              par=c(A=10, B=500, M=10, S=1),
              method="Nelder-Mead")
opt3 <- optim(fn=fun_to_optim, data=dat,
              par=c(A=10, B=500, M=10, S=1),
              method="BFGS")
Results:
xvec <- seq(1,10,length=101)
plot(y~x, data=dat)
lines(xvec, predict(n1, newdata=data.frame(x=xvec)))
p1 <- with(as.list(opt1$par), A + (B-A)/(1+exp((M-xvec)/S)))
lines(xvec, p1, col=2)
p2 <- with(as.list(opt2$par), A + (B-A)/(1+exp((M-xvec)/S)))
lines(xvec, p2, col=4)
p3 <- with(as.list(opt3$par), A + (B-A)/(1+exp((M-xvec)/S)))
lines(xvec, p3, col=6)
legend("topleft", col=c(1,2,4,6), lty=1,
legend=c("nls","NM (bad start)", "NM", "BFGS"))
nls and good starting values + BFGS overlap, and provide a good fit
optim/Nelder-Mead from bad starting values is absolutely terrible — converges on a constant line
optim/N-M from good starting values gets a reasonable fit, but obviously worse; I haven't analyzed why it gets stuck there.
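As an aside, the starting values that SSfpl's self-start machinery actually computes can be inspected directly with getInitial() from the stats package (a small illustration, reusing dat from above):

# the heuristically chosen starting values nls would use; these are roughly
# what the hand-picked values in opt2/opt3 approximate
getInitial(y ~ SSfpl(x, A, B, M, S), data = dat)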

LaplacesDemon: when should I sum prior density?

I am migrating from JAGS to LaplacesDemon and trying to rewrite some of my code. I have read the LaplacesDemon Tutorial and LaplacesDemon Examples vignettes and am a bit confused about some of the examples in them.
In the simple example in LaplacesDemon Tutorial (p.5), the model is written as:
Model <- function(parm, Data)
{
  beta <- parm[Data$pos.beta]
  sigma <- interval(parm[Data$pos.sigma], 1e-100, Inf)
  parm[Data$pos.sigma] <- sigma
  beta.prior <- dnormv(beta, 0, 1000, log=TRUE)
  sigma.prior <- dhalfcauchy(sigma, 25, log=TRUE)
  mu <- tcrossprod(beta, Data$X)
  LL <- sum(dnorm(Data$y, mu, sigma, log=TRUE))
  LP <- LL + sum(beta.prior) + sigma.prior
  Modelout <- list(LP=LP, Dev=-2*LL, Monitor=LP,
                   yhat=rnorm(length(mu), mu, sigma), parm=parm)
  return(Modelout)
}
Here, beta.prior is summed when computing LP because there is more than one beta parameter.
But in the more advanced examples in the LaplacesDemon Examples vignette, this rule doesn't always seem to be followed, for instance in example 87 (p. 162):
Model <- function(parm, Data)
{
  ### Log-Prior
  beta.prior <- sum(dnormv(beta[,1], 0, 1000, log=TRUE),
                    dnorm(beta[,-1], beta[,-Data$T],
                          matrix(tau, Data$K, Data$T-1), log=TRUE))
  zeta.prior <- dmvn(zeta, rep(0, Data$S), Sigma[ , , 1], log=TRUE)
  phi.prior <- sum(dhalfnorm(phi[1], sqrt(1000), log=TRUE),
                   dtrunc(phi[-1], "norm", a=0, b=Inf,
                          mean=phi[-Data$T], sd=sigma[2], log=TRUE))
  ### Log-Posterior
  LP <- LL + beta.prior + zeta.prior + sum(phi.prior) + sum(kappa.prior) +
    sum(lambda.prior) + sigma.prior + tau.prior
  Modelout <- list(LP=LP, Dev=-2*LL, Monitor=LP,
                   yhat=rnorm(prod(dim(mu)), mu, sigma[1]), parm=parm)
  return(Modelout)
}
(I have included only part of the code because the full example is long.)
Here, zeta has more than one element but is not summed in either the log-prior or the log-posterior part; beta has more than one element and is summed in the log-prior; and phi also has more than one element but is summed in both the log-prior and the log-posterior parts.
And in the next example on p.167, it seems to be different again.
In what scenarios should the prior density be summed? Many thanks!
Have you tried running the code line by line? You would see that there is nothing to sum, since dmvn is the density function of the multivariate normal distribution and it returns a single value: the probability density of observing the vector zeta. The reason for all the sums is that, to obtain the probability of observing two independent events together, we multiply their marginal probabilities (or sum their logs). So we multiply the probabilities of all the priors together to obtain their joint density.
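A toy illustration of that point (the parameter values here are made up, and the LaplacesDemon package is assumed to be installed for dmvn):

# three independent univariate priors: dnorm returns a vector of three
# log-densities, which must be summed to get the joint log-prior
beta <- c(0.2, -1.1, 0.5)
sum(dnorm(beta, 0, sqrt(1000), log = TRUE))

# a multivariate prior: dmvn already returns the single log-density of the
# whole vector zeta, so there is nothing left to sum
library(LaplacesDemon)
zeta <- c(0.3, -0.4)
dmvn(zeta, rep(0, 2), diag(2), log = TRUE)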

Finding Sample Size

I am attempting to use several methods (Wald, Wilson, Clopper-Pearson, Jeffreys, etc.) to calculate sample sizes for confidence intervals. I have been unable to find out how to calculate these in R. Is there a better way to calculate them than brute force? Does R have a package that will output all of them for comparison?
I have been unsuccessful with the likes of n.clopper.pearson{GenBinomApps}, and some of these require a lot of by-hand computation. I have done this for the Wald method:
#Variables
z <- 1.95996
d <- .05
p <- 0.5
q <- 1 - p
#Wald
n_wald <- (z^2 * (p*q))/(d^2)
n_wald
But I have not been able to find a way, other than guess-and-check methods, to produce the others in R.
I was able to answer my own question with help from the comments:
library(binomSamSize)
n_wald <- ciss.wald(p, d, alpha = 0.05)
n_wilson <- ciss.wilson(p, d, alpha = 0.05)
n_agresticoull <- ciss.agresticoull(p, d, alpha = 0.05)
These are from the binomSamSize package. I am still struggling with an approach for Clopper-Pearson and Jeffreys, if anyone can provide direction there, but these commands calculated the sample sizes easily.
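For Clopper-Pearson, in the absence of a ready-made function, one possible direction is a simple search over n using binom.test(), which computes the exact interval. This is only a sketch (the helper name is made up), and it assumes "sample size" here means the smallest n whose exact interval half-width at the planning value p is at most d:

ciss_clopper_pearson <- function(p, d, alpha = 0.05, n_max = 1e5) {
  for (n in seq_len(n_max)) {
    # exact (Clopper-Pearson) interval at the expected count for this n
    ci <- binom.test(round(n * p), n, conf.level = 1 - alpha)$conf.int
    if (max(abs(ci - p)) <= d) return(n)
  }
  NA
}
ciss_clopper_pearson(p = 0.5, d = 0.05)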

How to estimate the lambdas of Poisson-distributed samples in R and draw a kernel estimate of the density of the estimator based on them?

So I have 500 simulated Poisson-distributed samples with n = 100 each.
1) How can I estimate lambda for each of these samples separately in R?
2) How can I draw a kernel estimate of the density of the estimator for lambda based on the 500 estimated lambdas? (My guess is to somehow use the "KernSmooth" package and the function "bkfe", but I haven't managed to program it properly.)
taskpois <- function(size, leng){
  taskmlepois <- NULL
  for (i in 1:leng){
    randompois <- rpois(size, 6)
    taskmlepois[i] <- mean(randompois)
  }
  return(taskmlepois)
}
tasksample <- taskpois(size=100, leng=500)
As the comments suggest, it seems you're pretty close already.
ltarget <- 2
set.seed(101)
lambdavec <- replicate(500,mean(rpois(100,lambda=ltarget)))
dd <- density(lambdavec)
plot(dd,main="",las=1,bty="l")
We might as well add the expected result based on asymptotic theory:
curve(dnorm(x,mean=2,sd=sqrt(2/100)),add=TRUE,col=2)
We can add another line that shows that the variation among the densities of different experiments is pretty large relative to the difference between the theoretical and observed density from the first experiment:
lambdavec2 <- replicate(500,mean(rpois(100,lambda=ltarget)))
lines(density(lambdavec2),col=4)
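If you specifically want the KernSmooth package mentioned in the question, bkde() produces a binned kernel density estimate that plays the same role as density() above, while bkfe() estimates density functionals rather than the density itself. A small sketch, assuming the plot from the code above is still open:

library(KernSmooth)
# binned kernel density estimate of the 500 lambda-hats,
# with a direct plug-in bandwidth
bk <- bkde(lambdavec, bandwidth = dpik(lambdavec))
lines(bk$x, bk$y, col = 3)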

R: obtaining a probability distribution

I have a relationship:
y = a + b + c
I have the average and standard deviation of a, b and c
and I would like to obtain the probability distribution of y
from this by Monte Carlo simulation.
Is there a function or package or easy way that I can use to do this?
I assume that you are treating your inputs a, b and c as normally distributed, because you say you can define them with a mean and standard deviation. If that is the case, you can do this pretty quickly without any special package.
mu.a=33
mu.b=32
mu.c=13
sigma.a=22
sigma.b=22
sigma.c=222
n= a.large.number=10^5
a=rnorm(n,mu.a,sigma.a)
b=rnorm(n,mu.b,sigma.b)
c=rnorm(n,mu.c,sigma.c)
y=a+b+c
plot(density(y))
mean(y)
sd(y)
Make sure to be aware of all the assumptions we are making about y,a,b and c.
If you want to do something more complex, like figuring out the sampling variance of the mean of y, then do this procedure many times, collecting the mean each time, and plot the results.
mysimfun <- function(n, mu, sigma, stat.you.want='mean') {
  # mu has length 3 and so does sigma
  a <- rnorm(n, mu[1], sigma[1])
  b <- rnorm(n, mu[2], sigma[2])
  c <- rnorm(n, mu[3], sigma[3])
  y <- a + b + c
  plot(density(y))
  return(ifelse(stat.you.want == 'mean', mean(y), sd(y)))
}
mu <- c(mu.a, mu.b, mu.c)
sigma=c(sigma.a,sigma.b,sigma.c)
mi=rep(NA,100)
Then run it in a loop of some sort.
for(i in 1:100) {mi[i]=mysimfun(10,mu,sigma,stat.you.want='mean') }
par(mfrow=c(2,1))
hist(mi)
plot(density(mi))
mean(mi)
sd(mi)
There are two possible approaches: bootstrapping, which I think may be what you mean by Monte Carlo, or, if you are more interested in theory than in constructing estimates from empirical distributions, the 'distr' package and its friends 'distrSim' and 'distrTEst'.
require(boot)
ax <- rnorm(100); bx <- runif(100); cx <- rexp(100)
dat <- data.frame(ax=ax, bx=bx, cx=cx)
# ordinary (nonparametric) bootstrap: the statistic receives the data and
# an index vector describing each resample
boot(dat, function(d, i){ with(d[i, ], mean(ax+bx+cx)) }, R=1000)
boot(dat, function(d, i){ with(d[i, ], sd(ax+bx+cx)) }, R=1000)
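To make the second suggestion concrete, here is a minimal sketch with the distr package, assuming normally distributed inputs with the same means and standard deviations as in the first answer; distr constructs the distribution of the sum by convolution rather than by simulation:

library(distr)
# distribution object for y = a + b + c
Y <- Norm(33, 22) + Norm(32, 22) + Norm(13, 222)
plot(Y)            # density, CDF and quantile function of y
ysim <- r(Y)(1e5)  # r(Y) returns a random-number generator for y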
