How does setting the opt argument for lmeControl change estimation? - r

I'm wondering if anyone knows how exactly setting the optimizer in lme() to opt='optim' changes parameter estimation.
As in this example:
ctrl <- lmeControl(opt='optim');
flow.lme <- lme(rate ~ nozzle, error= nozzle|operator, control=ctrl, data=Flow)
A related question was posed and answered here (https://stats.stackexchange.com/questions/40647/lme-error-iteration-limit-reached) but I don't have the reputation points to comment on it. : )

From ?lmeControl:
opt: the optimizer to be used, either ‘"nlminb"’ (the default) or
‘"optim"’.
optimMethod: character - the optimization method to be used with the
‘optim’ optimizer. The default is ‘"BFGS"’. An alternative
is ‘"L-BFGS-B"’.
As part of the estimation process, lme has to use a nonlinear optimization function to estimate the variance-covariance parameters. nlminb() and optim() are the two main built-in optimizers in R: while nlminb uses a single underlying algorithm, optim gives a choice of algorithms.
It's rather difficult to know a priori which nonlinear optimization function will work best on a particular set of data.

Related

binomial()$linkinv(fixef()) and binomial_pred_ci() functions: what exactly does these function are for when applied to mixed generalized analysis?

I was workinf on a dataset, trying re-perfomrming an already run statysical analysis and I met the following function:
binomial()$linkinv(fixef(m))
after running the following model
summary((m = glmer(T1.ACC ~ COND + (COND | ID), d9only, family = binomial)))
My first question is what exactly does this functions is made for? Beacuse throgh other command lines the reciprocal code as well as a slightly modified code based always on it are also reported:
1) 1- binomial()$linkinv(fixef())
2) d9only$fit = binomial()$linkinv(model.matrix(m) %*% fixef(m)) #also the sense of the operator %*% is quite misterious too.
Moreover, another function present is the following one:
binomial_pred_ci()
To be honest, I've to search through the overall script and no customized function there was or the package where that has been called from either? Anyone knows where does it may come from? Maybe the package 'runjags'? Just in case, any on how to download it?
Thanks for your answers
I agree with most of #Oliver's answer. I will add a few comments (since I had an answer partly composed already).
I would be very wary of the script you are following: some parts look wrong (I could obviously be mistaken since these bits are taken completely out of context ...)
binomial()$linkinv refers to the inverse link function for the model used. By default (which applies in this case since no optional link= argument has been specified), this is the inverse-logit or logistic function A nearly equivalent function is available via plogis(), but using $linkinv could be better in some cases since it would generalize to binomial analyses done with other link functions [e.g. probit or cloglog].
as #Oliver mentions, applying the inverse link function to the coefficients is at least weird, I would even say wrong. Researchers often exponentiate coefficients estimated on the logit/log-odds scale to obtain odds ratios, but applying the inverse link (usually logistic function) is rarely correct.
binomial()$linkinv(model.matrix(m) %*% fixef(m)) is indeed computing the predicted estimates on the link scale and converting them back to the data (= probability) scale. You can get the same results more reliably (handling missing values, etc.) by using predict(m, type = "response", re.form = ~0) (this extends #Oliver's answer to a case that also applies the inverse-link function for you).
I don't know what binomial_pred_ci is either, but I would suggest you look at predictInterval() from the merTools package ...
PS these answers all have not much to do with runjags, which uses an entirely different model structure. Presumably glmer models are being fitted for comparison ...
help(binomial) describes the link function and inverse link function and their uses. binomial()$linkinv is the binomial inverse-link function (sigmoid function) prob(y|eta) = 1 / (1 + exp(-eta)) where eta is the linear predictor. Using this with the coefficients (or fixed effects) is a bit odd, but is not unusual to get an idea of how large the effect of each coefficient is. I would not encourage it however.
%*% is the matrix multiplier, while model.matrix(m) (for lme4) extracts the fixed effect model matrix. So model.matrix(m) %*% fixef(m) is the linear predictor using only fixed effects. It would be the same as predict(m, re.form = ~ 0). This is often used in case you want to use the fixed effect model either because you want to correct for between-group-variation or because you are predicting new data.
binomial_pred_ci no idea. Guessing it's a function for predicting confidence levels.

mlr3 optimized average of ensemble

I try to optimize the averaged prediction of two logistic regressions in a classification task using a superlearner.
My measure of interest is classif.auc
The mlr3 help file tells me (?mlr_learners_avg)
Predictions are averaged using weights (in order of appearance in the
data) which are optimized using nonlinear optimization from the
package "nloptr" for a measure provided in measure (defaults to
classif.acc for LearnerClassifAvg and regr.mse for LearnerRegrAvg).
Learned weights can be obtained from $model. Using non-linear
optimization is implemented in the SuperLearner R package. For a more
detailed analysis the reader is referred to LeDell (2015).
I have two questions regarding this information:
When I look at the source code I think LearnerClassifAvg$new() defaults to "classif.ce", is that true?
I think I could set it to classif.auc with param_set$values <- list(measure="classif.auc",optimizer="nloptr",log_level="warn")
The help file refers to the SuperLearner package and LeDell 2015. As I understand it correctly, the proposed "AUC-Maximizing Ensembles through Metalearning" solution from the paper above is, however, not impelemented in mlr3? Or do I miss something? Could this solution be applied in mlr3? In the mlr3 book I found a paragraph regarding calling an external optimization function, would that be possible for SuperLearner?
As far as I understand it, LeDell2015 proposes and evaluate a general strategy that optimizes AUC as a black-box function by learning optimal weights. They do not really propose a best strategy or any concrete defaults so I looked into the defaults of the SuperLearner package's AUC optimization strategy.
Assuming I understood the paper correctly:
The LearnerClassifAvg basically implements what is proposed in LeDell2015 namely, it optimizes the weights for any metric using non-linear optimization. LeDell2015 focus on the special case of optimizing AUC. As you rightly pointed out, by setting the measure to "classif.auc" you get a meta-learner that optimizes AUC. The default with respect to which optimization routine is used deviates between mlr3pipelines and the SuperLearner package, where we use NLOPT_LN_COBYLA and SuperLearner ... uses the Nelder-Mead method via the optim function to minimize rank loss (from the documentation).
So in order to get exactly the same behaviour, you would need to implement a Nelder-Mead bbotk::Optimizer similar to here that simply wraps stats::optim with method Nelder-Mead and carefully compare settings and stopping criteria. I am fairly confident that NLOPT_LN_COBYLA delivers somewhat comparable results, LeDell2015 has a comparison of the different optimizers for further reference.
Thanks for spotting the error in the documentation. I agree, that the description is a little unclear and I will try to improve this!

Initial parameters in nonlinear regression in R

I want to learn how to do nonlinear regression in R. I managed to learn the basics of the nls function, but how we know it's crucial in nonlinear regression to use good initial parameters. I tried to figure out how selfStart and getInitial functions works but failed. The documentation is very scarce and not very usefull. I wanted to learn these functions via a simple simulation data. I simulated data from logistic model:
n<-100 #Number of observations
d<-10000 #our parameters
b<--2
e<-50
set.seed(n)
X<-rnorm(n, -e/b, 2) #Thanks to it we'll have many observations near the point where logistic function grows the faster
Y<-d/(1+exp(b*X+e))+rnorm(n, 0, 200) #I simulate data
Now I wanted to do regression with a function f(x)=d/(1+exp(b*x+e)) but I don't know how to use selfStart or getInitial. Could you help me? But please, don't tell me about SSlogis. I'm aware it's a functon destined to find initial parameters in logistic regression, but It seems it only works in regression with one explanatory variable and I'd like to learn how to do logistic regression with more than one explanatory variables and even how to do general nonlinear regression with a function that I defined mysefl.
I will be very gratefull for your help.
I don't know why the calculus of good initial parameters fails in R. The aim of my answer is to provide a method to find good enough initial parameters.
Note that a non-iterative method exists which doesn't requires initial parameters. The principle is explained in this paper, pp.37-46 : https://fr.scribd.com/doc/14674814/Regressions-et-equations-integrales
A simplified version is shown below.
If the results are not sufficient, they can be used as initial parameters in an usual non-linear regression software such as in R.
A numerical example is shown below. Usually the number of points is much higher. Here it is deliberately low in order to make easier the checking when one edit the code and check it.

R Supervised Latent Dirichlet Allocation Package

I'm using this LDA package for R. Specifically I am trying to do supervised latent dirichlet allocation (slda). In the linked package, there's an slda.em function. However what confuses me is that it asks for alpha, eta and variance parameters. As far as I understand, I thought these parameters are unknowns in the model. So my question is, did the author of the package mean to say that these are initial guesses for the parameters? If yes, there doesn't seem to be a way of accessing them from the result of running slda.em.
Aside from coding the extra EM steps in the algorithm, is there a suggested way to guess reasonable values for these parameters?
Since you are trying to generate a supervised model, the typical approach would be to use cross validation to determine the model parameters. So you hold out some of the data as your test set, train the a model on the remaining data, and evaluate the model performance, repeating k times. You then continue to repeat with different model parameters to determine which result in the best model performance.
In the specific case of slda, I would run demo(slda) to see the author's implementation of it. When you run the demo, you'll see that he sets alpha=1.0, eta=0.1, and variance=0.25. I'd suggest using these as your starting point, and then use cross validation to determine better parameters if you need to improve model performance.

Optimization in R with arbitrary constraints

I have done it in Excel but need to run a proper simulation in R.
I need to minimize function F(x) (x is a vector) while having constraints that sum(x)=1, all values in x are [0,1] and another function G(x) > G_0.
I have tried it with optim and constrOptim. None of them give you this option.
The problem you are referring to is (presumably) a non-linear optimization with non-linear constraints. This is one of the most general optimization problems.
The package I have used for these purposes is called nloptr: see here. From my experience, it is both versatile and fast. You can specify both equality and inequality constaints by setting eval_g_eq and eval_g_ineq, correspondingly. If the jacobians are known explicitly (can be derived analytically), specify them for faster convergence; otherwise, a numerical approximation is used.
Use this list as a general reference to optimization problems.
Write the set of equations using the Lagrange multiplier, then solve using the R command nlm.
You can do this in the OpenMx Package (currently host at the site listed below. Aiming for 2.0 relase on cran this year)
It is a general purpose package mostly used for Structural Equation Modelling, but handling nonlinear constraints.
FOr your case, make an mxModel() with your algebras expressed in mxAlgebras() and the constraints in mxConstraints()
When you mxRun() the model, the algebras will be solved within the constraints, if possible.
http://openmx.psyc.virginia.edu/

Resources