No longer can use vglm for underdispersed count data in R?

I've read elsewhere that the VGAM package can be used to model underdispersed count data via the genpoisson families. However, when I look up the help file for genpoisson0, genpoisson1, and genpoisson2 they all say the following:
"In theory the λ parameter is allowed to be negative to handle underdispersion, however this is no longer supported, hence 0 < λ < 1."
"In theory the \varphi parameter might be allowed to be less than unity to handle underdispersion but this is not supported."
"In theory the α parameter might be allowed to be negative to handle underdispersion but this is not supported."
Where can I go to handle underdispersion?

You can use quasi-likelihood methods, e.g. family = quasipoisson in glm() (in base R).
The glmmTMB package supports COM-Poisson (family = compois) and generalized Poisson (family = genpois) conditional distributions.
It's not clear to me whether the reasons discussed (briefly) here for why underdispersed generalized Poisson distributions are no longer supported in VGAM also apply to the implementation in glmmTMB ...
There is some discussion of the glmmTMB parameterizations/implementations of COM-Poisson and generalized Poisson in Brooks et al (2019).
Brooks, Mollie E., Kasper Kristensen, Maria Rosa Darrigo, Paulo Rubim, María Uriarte, Emilio Bruna, and Benjamin M. Bolker. “Statistical Modeling of Patterns in Annual Reproductive Rates.” Ecology 100, no. 7 (2019): e02706.
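As a minimal sketch of the quasi-likelihood route mentioned above, the following base-R example simulates counts whose variance is below their mean and fits them with family = quasipoisson. The simulated data, the variable names, and the use of binomial draws to create underdispersion are illustrative assumptions, not part of the original answer.

```r
# Simulate underdispersed counts: a binomial draw with mean mu has
# variance mu * (1 - mu / size), which is smaller than mu.
set.seed(1)
n    <- 500
x    <- runif(n)
mu   <- exp(0.5 + 1 * x)          # true mean on the log-link scale
size <- 10
y    <- rbinom(n, size = size, prob = mu / size)

# Quasi-Poisson fit: same mean model as Poisson, but the dispersion
# parameter is estimated from the data instead of being fixed at 1.
fit  <- glm(y ~ x, family = quasipoisson)
disp <- summary(fit)$dispersion

print(coef(fit))
print(disp)
```

A dispersion estimate below 1 indicates underdispersion relative to the Poisson, and the quasi-Poisson standard errors are scaled accordingly. The same formula could be passed to glmmTMB::glmmTMB() with family = compois or family = genpois, as suggested above.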


mlr3 optimized average of ensemble

I am trying to optimize the averaged prediction of two logistic regressions in a classification task using a superlearner.
My measure of interest is classif.auc.
The mlr3 help file (?mlr_learners_avg) tells me:
Predictions are averaged using weights (in order of appearance in the
data) which are optimized using nonlinear optimization from the
package "nloptr" for a measure provided in measure (defaults to
classif.acc for LearnerClassifAvg and regr.mse for LearnerRegrAvg).
Learned weights can be obtained from $model. Using non-linear
optimization is implemented in the SuperLearner R package. For a more
detailed analysis the reader is referred to LeDell (2015).
I have two questions regarding this information:
When I look at the source code, I think LearnerClassifAvg$new() defaults to "classif.ce"; is that true?
I think I could set it to classif.auc with param_set$values <- list(measure="classif.auc",optimizer="nloptr",log_level="warn")
The help file refers to the SuperLearner package and LeDell 2015. If I understand it correctly, however, the "AUC-Maximizing Ensembles through Metalearning" solution proposed in that paper is not implemented in mlr3. Or am I missing something? Could this solution be applied in mlr3? In the mlr3 book I found a paragraph about calling an external optimization function; would that be possible for SuperLearner?
As far as I understand it, LeDell2015 proposes and evaluates a general strategy that optimizes AUC as a black-box function by learning optimal weights. They do not really propose a best strategy or any concrete defaults, so I looked into the defaults of the SuperLearner package's AUC optimization strategy.
Assuming I understood the paper correctly:
LearnerClassifAvg basically implements what is proposed in LeDell2015: it optimizes the weights for any metric using non-linear optimization. LeDell2015 focuses on the special case of optimizing AUC. As you rightly pointed out, by setting the measure to "classif.auc" you get a meta-learner that optimizes AUC. The default optimization routine differs between mlr3pipelines and the SuperLearner package: we use NLOPT_LN_COBYLA, whereas SuperLearner ... uses the Nelder-Mead method via the optim function to minimize rank loss (from the documentation).
So in order to get exactly the same behaviour, you would need to implement a Nelder-Mead bbotk::Optimizer similar to here that simply wraps stats::optim with method Nelder-Mead, and carefully compare settings and stopping criteria. I am fairly confident that NLOPT_LN_COBYLA delivers somewhat comparable results; LeDell2015 has a comparison of the different optimizers for further reference.
Thanks for spotting the error in the documentation. I agree that the description is a little unclear and I will try to improve this!
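To make the idea concrete, here is a minimal base-R sketch of learning ensemble weights by minimizing rank loss (1 - AUC) with stats::optim and method "Nelder-Mead", as described above. It is not the code of SuperLearner or mlr3pipelines: the simulated probabilities p1 and p2 stand in for out-of-fold predictions of two base learners, and the helper names auc, normalize and rank_loss are hypothetical.

```r
set.seed(42)
n     <- 400
truth <- rbinom(n, 1, 0.5)

# Hypothetical out-of-fold probabilities from two base learners
p1 <- plogis(1.0 * truth + rnorm(n))
p2 <- plogis(0.8 * truth + rnorm(n))

# AUC via the rank-sum (Mann-Whitney) formula; ties get average ranks
auc <- function(score, label) {
  r     <- rank(score)
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Map unconstrained parameters to non-negative weights that sum to 1
normalize <- function(par) {
  w <- abs(par)
  w / sum(w)
}

# Black-box objective: rank loss of the weighted average prediction
rank_loss <- function(par) {
  w <- normalize(par)
  1 - auc(w[1] * p1 + w[2] * p2, truth)
}

opt     <- optim(c(0.5, 0.5), rank_loss, method = "Nelder-Mead")
weights <- normalize(opt$par)
auc_ens <- 1 - opt$value
auc_eq  <- auc(0.5 * p1 + 0.5 * p2, truth)  # equal-weight baseline

print(weights)
print(c(optimized = auc_ens, equal_weights = auc_eq))
```

Because AUC is piecewise constant in the weights, derivative-free methods such as Nelder-Mead or COBYLA are natural choices here. Results from different optimizers may differ slightly depending on starting values and stopping criteria.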

How are MESS maps calculated in the dismo package?

Elith et al. [1] describe a method of measuring the dissimilarity between the values used in model fitting and the values used in making predictions. In the context of species distribution modelling (ecological niche modelling), that prediction is a 'projection'. The method is called 'multivariate environmental similarity surface' (MESS) analysis. There is a function in the dismo package to estimate it (as well as a function built into the MAXENT Java program).
q1: Does anyone know what units are reported by the dismo::mess function?
The dismo::mess function reports not only a MESS for each predictor (received and reported as a raster), but also reports a layer named 'rmess'. In the help file it is described as "an additional layer with the MESS values".
q2: How are the MESS values calculated?
q3: What is the rmess layer a measure of?
Thanks for your help!
[1] Elith, J., Kearney, M. & Phillips, S. 2010 The art of modelling range-shifting species. Methods in Ecology and Evolution 1, 330-342. (doi:10.1111/j.2041-210X.2010.00036.x).
You can see what dismo does by typing
It calls .messi3, which you can see with
(I found the answer in the appendices...)
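For orientation, here is a small base-R sketch of the per-variable similarity and the overall MESS as I understand the definition in the appendix of Elith et al. (2010). It is an illustration under that reading, not the dismo source; the function names mess_one and mess_point and the toy reference data are hypothetical.

```r
# Similarity of one value p to a vector of reference values for the
# same variable. f is the percentage of reference values smaller than p.
mess_one <- function(p, ref) {
  f   <- 100 * mean(ref < p)
  rng <- range(ref)
  if (f == 0) {
    100 * (p - rng[1]) / diff(rng)   # at or below the reference minimum
  } else if (f <= 50) {
    2 * f                            # lower half of the reference range
  } else if (f < 100) {
    2 * (100 - f)                    # upper half of the reference range
  } else {
    100 * (rng[2] - p) / diff(rng)   # above the reference maximum
  }
}

# Overall MESS for one location: the minimum similarity over variables
mess_point <- function(p_vec, ref_df) {
  min(mapply(mess_one, p_vec, ref_df))
}

# Toy reference data with two variables
ref <- data.frame(temp = 1:10, rain = 1:10)

inside  <- mess_point(c(5.5, 5.5), ref)  # typical for both variables
outside <- mess_point(c(5.5, 20), ref)   # rain far above reference range
print(c(inside = inside, outside = outside))
```

Under this reading the values are on a percent-like scale with a maximum of 100, negative values flag locations where at least one variable lies outside the reference range, and the overall MESS is driven by the most dissimilar variable. Assuming dismo is installed, the package's own implementation can be inspected by printing the functions, for example dismo::mess and dismo:::.messi3.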

DOI in CRAN submission of R package?

After submitting an R package to CRAN, I received the following suggestion:
"Is there some reference about the method you can add in the Description field in the form Authors (year) <doi:.....>?"
After doing some searching, I haven't really found any instances of people putting DOIs in the Description file, except perhaps in the CITATION file, but that is not what is asked for here it seems. May I ask how I might go about resolving this issue? Thanks in advance!
Your searching may have been superficial. Limiting it to the subset of what I may have installed here so that I can grep:
edd@rob:~$ grep -l "<doi:.*>" /usr/local/lib/R/site-library/*/DESCRIPTION
And, just to be plain, here are the first ten lines of the actual result set:
edd@rob:~$ grep -h "<doi:.*>" /usr/local/lib/R/site-library/*/DESCRIPTION | head -10
80:580-598. <doi:10.1080/01621459.1985.10478157>].
<doi:10.1080/01621459.1988.10478610>]. A good introduction to these two methods is in chapter 16 of
See Christian Borgelt (2012) <doi:10.1002/widm.1074>.
Description: Contains procedures for depth-based supervised learning, which are entirely non-parametric, in particular the DDalpha-procedure (Lange, Mosler and Mozharovskyi, 2014 <doi:10.1007/s00362-012-0488-4>). The training data sample is transformed by a statistical depth function to a compact low-dimensional space, where the final classification is done. It also offers an extension to functional data and routines for calculating certain notions of statistical depth functions. 50 multivariate and 5 functional classification problems are included. (Pokotylo, Mozharovskyi and Dyckerhoff, 2019 <doi:10.18637/jss.v091.i05>).
Brest et al. (2006) <doi:10.1109/TEVC.2006.872133>.
Description: An R6 object oriented distributions package. Unified interface for 42 probability distributions and 11 kernels including functionality for multiple scientific types. Additionally functionality for composite distributions and numerical imputation. Design patterns including wrappers and decorators are described in Gamma et al. (1994, ISBN:0-201-63361-2). For quick reference of probability distributions including d/p/q/r functions and results we refer to McLaughlin, M. P. (2001). Additionally Devroye (1986, ISBN:0-387-96305-7) for sampling the Dirichlet distribution, Gentle (2009) <doi:10.1007/978-0-387-98144-4> for sampling the Multivariate Normal distribution and Michael et al. (1976) <doi:10.2307/2683801> for sampling the Wald distribution.
proposed by Marsaglia and Tsang (2000, <doi:10.18637/jss.v005.i08>).
Threefry engine (Salmon et al., 2011 <doi:10.1145/2063384.2063405>) as
Splines" <doi:10.1214/aos/1176347963>.

Which loss function is used in R package gbm for multinomial distribution?

I am using the R package gbm to fit probabilistic classifiers on a dataset with more than 2 classes. I am passing distribution = "multinomial" as an argument; however, I am having difficulty finding out what is actually implemented by that choice.
The help function for gbm states that
Currently available options are "gaussian" (squared error), "laplace" (absolute loss), "tdist" (t-distribution loss), "bernoulli" (logistic regression for 0-1 outcomes), "huberized" (huberized hinge loss for 0-1 outcomes), classes), "adaboost" (the AdaBoost exponential loss for 0-1 outcomes), "poisson" (count outcomes), "coxph" (right censored observations), "quantile", or "pairwise" (ranking measure using the LambdaMart algorithm).
and does not list multinomial, whereas the paragraph preceding the one I copied states that
... If not specified, gbm will try to guess: ... if the response is a factor,multinomial is assumed; ...
I would like to know which loss function is implemented if I specify distribution = "multinomial". The documentation in the vignette which can be accessed via
does not contain the word "multinomial" or any descriptions of what that argument implies.
I have tried to look at the package source code, but can't find the information there either. It seems that the relevant things happen in the C++ functions in the file /src/multinomial.cpp; however, my knowledge of C++ is too limited to understand what is going on there.

Algorithm name in nlminb's PORT routines?

I'm using the gnls function of the nlme package to fit a curve. When I tried to find out which optimizer it uses, I was directed to the nlminb function documentation, which states:
Unconstrained and box-constrained optimization using PORT routines.
I don't know what "PORT routines" are. Is it a collection of optimization algorithms, or is it a single optimization algorithm called "PORT routines"?
Can anyone please at least tell me the names of some of the "routines"? For example, "gradient descent", "Levenberg–Marquardt", or "trust region"?
Thanks in advance!!
nlminb is an unconstrained and bounds-constrained quasi-Newton optimizer. Its code is based on the FORTRAN PORT library by David Gay at Bell Labs, which was designed to be portable across different types of computers (from comments by Erwin Kalvelagen). See more here (Section 2).
L-BFGS-B and BFGS, being members of the quasi-Newton family of methods, are the closest analogs of nlminb's "adaptive nonlinear least-squares algorithm".
You can see the original report at An Adaptive Nonlinear Least-Squares Algorithm by J.E. Dennis, Jr., David M. Gay, Roy E. Welsch (thanks to Ben Bolker's comment).
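As a small illustration of the comparison above, the following base-R sketch minimizes the Rosenbrock test function under box constraints with nlminb (PORT routines) and with optim using method "L-BFGS-B". The test function, starting point and bounds are arbitrary choices for demonstration.

```r
# Rosenbrock function: minimum of 0 at (1, 1)
rosenbrock <- function(p) (1 - p[1])^2 + 100 * (p[2] - p[1]^2)^2

start <- c(-1.2, 1)
lower <- c(-2, -2)
upper <- c(2, 2)

# PORT routines via nlminb (gradient approximated numerically)
fit_port <- nlminb(start, rosenbrock, lower = lower, upper = upper)

# Quasi-Newton alternative with box constraints via optim
fit_lbfgsb <- optim(start, rosenbrock, method = "L-BFGS-B",
                    lower = lower, upper = upper)

print(fit_port$par)       # nlminb stores the minimum in $objective
print(fit_lbfgsb$par)     # optim stores the minimum in $value
```

Both calls accept lower and upper bounds in the same way, so switching between them mostly means renaming the result fields ($objective vs. $value).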
