I have written a function for performing maximum simulated likelihood estimation in R, which works quite well.
However, the problem is that optim does not call the same function to compute the likelihood value and the gradient at the same time, the way matlab's fminunc optimizer does. Thus, every time optim wants to update the gradient, the simulation for the given parameter vector has to be repeated. In the end, optim has called the loglik function about 100 times to update the parameters and an additional 50 times to compute the gradient.
I am wondering if there is an elegant way to avoid the 50 additional simulation runs, for example by storing the estimated likelihood value and gradient at each step. Then, before the likelihood function is called the next time, one would check whether the information for a given parameter vector is already available. This could be done by interposing an additional function between optim and the loglik function, but that seems a bit clumsy.
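Something along the lines of the following caching wrapper is what I have in mind (just a sketch; loglik_sim stands for my simulation-based log-likelihood, assumed to return both the value and the gradient, and start for the starting values):

# Caching wrapper: loglik_sim(par) is assumed to run the simulation once
# and return list(value = ..., gradient = ...).
make_cached <- function(loglik_sim) {
  last_par <- NULL
  last_res <- NULL
  update <- function(par) {
    if (is.null(last_par) || !identical(par, last_par)) {
      last_par <<- par
      last_res <<- loglik_sim(par)
    }
    last_res
  }
  list(fn = function(par) update(par)$value,
       gr = function(par) update(par)$gradient)
}

cached <- make_cached(loglik_sim)
res <- optim(par = start, fn = cached$fn, gr = cached$gr, method = "BFGS")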
Any good ideas?
Cheers,
Ben
I use nlm to maximize a likelihood in R. I would like to predict the number of likelihood evaluations and abort if the task is likely to take too long. nlm returns the number of 'iterations' (typically 10-20), and I take it that each iteration involves one numerical evaluation of the Hessian. Time for each iteration (Hessian?) depends on the number of parameters. So I'd like to know: What is the general relationship between the number of parameters and the number of function evaluations per iteration in nlm?
The question is very general, and so is my answer.
From the reference of nlm:
If the function value has an attribute called gradient or both
gradient and hessian attributes, these will be used in the calculation
of updated parameter values. Otherwise, numerical derivatives are
used. deriv returns a function with suitable gradient attribute and
optionally a hessian attribute.
If you provide the gradient and the Hessian for a function being minimized, each iteration involves two function evaluations. If you don't, the Hessian and the gradient are calculated numerically. The source code can be found here. As far as I understand, the parameters of the R nlm function only affect the number of iterations until convergence, but have no influence on the way gradients are evaluated numerically.
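As a small illustration of the attribute mechanism (a sketch with a toy quadratic objective, not the original likelihood), attaching an analytic gradient and Hessian lets nlm skip the numerical derivatives:

# Toy objective f(p) = sum((p - c(1, 2))^2) with analytic gradient and
# Hessian attached as attributes, so nlm does not differentiate numerically.
f <- function(p) {
  val <- sum((p - c(1, 2))^2)
  attr(val, "gradient") <- 2 * (p - c(1, 2))
  attr(val, "hessian")  <- diag(2, 2)   # Hessian of the quadratic is 2 * I
  val
}
res <- nlm(f, p = c(10, 10))
res$estimate     # close to c(1, 2)
res$iterations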
I would like to perform a nonlinear transformation of a regression coefficient, for example a ratio or some other nonlinear combination of coefficients.
Stata has a convenient implementation of this in nlcom, which employs the delta method to estimate standard errors and the corresponding confidence intervals. I understand that a simple transformation like this can be done by directly addressing the coefficient of interest from the model. However, if we are interested in the ratio of several linear and nonlinear combinations, what would be an efficient method to produce confidence bounds on such a transformation? Moreover, what if the coefficients come with a full covariance matrix, with standard errors estimated along with them?
To answer my own question, I discovered the msm package, whose deltamethod() function accommodates my request nicely. UCLA has a really nice write-up of this method, so I am providing the link for anyone who might have a similar need.
Using the delta method for nonlinear transformations of regression coefficients.
The deltaMethod() function from the car package also accomplishes the same, providing as output the estimate, its standard error and a 95% confidence interval.
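For a concrete illustration, the standard error of a ratio of two slope coefficients from a linear model might be obtained like this (a minimal sketch with made-up data):

library(msm)
library(car)

set.seed(1)
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
d$y <- 1 + 2 * d$x1 + 0.5 * d$x2 + rnorm(100)
fit <- lm(y ~ x1 + x2, data = d)

# Delta-method SE for the ratio of the two slopes, b_x1 / b_x2.
# msm::deltamethod() refers to parameters positionally as x1, x2, ...;
# here positions 2 and 3 of coef(fit) are the slopes (position 1 is the intercept).
est <- unname(coef(fit)[2] / coef(fit)[3])
se  <- msm::deltamethod(~ x2 / x3, coef(fit), vcov(fit))
c(estimate = est, se = se)

# car::deltaMethod() uses coefficient names and also reports a 95% CI.
car::deltaMethod(fit, "x1 / x2")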
I am doing a model fit using the minpack.lm package. The objective function is the residual between my experimental data and my ODE model.
The tricky thing is the initial value for my ODE model, so I use a loop that draws random initial values and keeps the best result, i.e. the one that minimizes the objective function, roughly as sketched below.
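(A stripped-down sketch of the loop; residual_fn stands for my actual objective, which solves the ODE for the given parameters and returns the residual vector, and the number of draws and the ranges are placeholders.)

library(minpack.lm)

# residual_fn(par) is a placeholder: it solves the ODE with parameters 'par'
# and returns the vector of residuals against the experimental data.
best <- NULL
for (i in 1:50) {
  start <- runif(3, min = 0, max = 1)       # random initial values (ranges made up)
  fit <- nls.lm(par = start, fn = residual_fn)
  if (is.null(best) || fit$deviance < best$deviance) best <- fit
}
best$par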
The problem is that if the random initial value is bad, my model cannot solve the ODE, and the result I get is either NaN or an error such as "the problem has not converged" or "number of time steps 1 exceeded maxsteps at t = 0". So my questions are:
Is there any way to detect when the initial value is bad and move on to the next initial value within the loop?
Do you have any advice for choosing better initial values than drawing them randomly?
Thanks a lot
I am using lmRob.
require(robust)
stack.rob.int <- lmRob(Loss ~.*., data = stack.dat)
Fine, but I was wondering how I could obtain the psi function that is used by lmRob in the actual fitting. Thanks in advance for any help!
If I were to use the lmrob function in robustbase, is it possible to change the psi function by subtracting a constant from it? I am trying to implement the bootstrap as per Lahiri (Annals of Statistics, 1992), where the way to keep the bootstrap valid is said to be to replace psi() with the original psi() minus the mean of the residuals when fitting the bootstrap replicates of the robust linear model.
So, there is no way to access the psi function directly for robust::lmRob().
Simply put, lmRob() calls lmRob.fit() (or lmRob.wfit() if you supply weights), which subsequently calls lmRob.fit.compute(), which then sets initial values for the underlying Fortran routine depending on whether lmRob.control() is set to "bisquare" or "optimal".
As a result, if you need access to the psi functions, you may wish to use robustbase instead, as it provides easy access to many psi functions (cf. the biweights); see the sketch below.
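For example, robustbase exposes the psi function, its derivative and the corresponding rho/chi function directly (a small sketch using Tukey's bisquare with the usual 95%-efficiency tuning constant):

library(robustbase)

# Tukey's bisquare (biweight) psi on a grid, tuning constant cc = 4.685.
x <- seq(-6, 6, by = 0.1)
psi_vals  <- Mpsi(x, cc = 4.685, psi = "bisquare")             # psi(x)
dpsi_vals <- Mpsi(x, cc = 4.685, psi = "bisquare", deriv = 1)  # psi'(x)
chi_vals  <- Mchi(x, cc = 4.685, psi = "bisquare")             # rescaled rho
plot(x, psi_vals, type = "l", ylab = "psi(x)")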
Edit 1
Regarding:
psi function evaluated at the residuals in lmRob
No. What is available after running lmRob is documented in lmRob.object, accessible via ?lmRob.object. Regarding residuals, the following components are available in the lmRob object.
residuals: the residual vector corresponding to the estimates returned in coefficients.
T.residuals: the residual vector corresponding to the estimates returned in T.coefficients.
M.weights: the robust estimate weights corresponding to the final MM-estimates in coefficients, if applicable.
T.M.weights: the robust estimate weights corresponding to the initial S-estimates in T.coefficients, if applicable.
Regarding
what does "optimal" do in lmRob?
Optimal refers to the following psi function:
sign(x) * ( -(phi'(|x|) + c) / phi(|x|) )
For other traditional psi functions, you may wish to look at robustbase's vignette or a robust statistics textbook.
I am currently converting a set of matlab models that calculate the log likelihood using the optimizer fminunc to R, using optim with method 'BFGS'.
I have the initial values, the maximum likelihood values and the final parameter estimates for all the matlab models. Most of the converted R models find, using optim, the same log likelihood and the same final parameter values from the same initial parameters as matlab. However, some get stuck at a local optimum; this can be fixed by using the matlab final parameter values as the initial values, and these models then find the matlab maximum likelihood values.
Is there a more powerful optimizer for R that is on a par with matlab's, or is it just that R's optim is more likely to get stuck at a local optimum, so that initial parameter values become more critical for reaching the maximum log likelihood rather than getting stuck at a local optimum?
res <- optim(par = x, fn = BOTH4classnochange, hessian = TRUE, method = 'BFGS', control = list(maxit = MaxIter, abstol = TolFun, reltol = TolX))
Are you aware of the CRAN Task View on Optimization listing every related package? And yes, there are dozens.
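If local optima are the real issue, one cheap remedy (regardless of which optimizer you pick from the Task View) is a multi-start: rerun optim from several perturbed starting points and keep the best fit. A minimal sketch, reusing BOTH4classnochange, x and the control values from the call above; the number of restarts and the perturbation scale are arbitrary choices for illustration:

# Multi-start around the original starting values x.
set.seed(42)
fits <- lapply(1:20, function(i) {
  start <- x + rnorm(length(x), sd = 0.5)
  optim(par = start, fn = BOTH4classnochange, hessian = TRUE, method = "BFGS",
        control = list(maxit = MaxIter, abstol = TolFun, reltol = TolX))
})
best <- fits[[which.min(sapply(fits, `[[`, "value"))]]
best$value   # smallest objective value found (optim minimizes)
best$par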