I am currently fitting a penalized logistic regression model using the package logistf (due to quasi-complete separation).
I chose this package over brglm because I found many more recommendations for logistf. However, brglm seems to integrate better with other functions such as predict() or margins::margins(). The brglm documentation says:
"Implementations of the bias-reduction method for logistic regressions can also be found in thelogistf package. In addition to the obvious advantage ofbrglmin the range of link functions that can be used ("logit","probit","cloglog"and"cauchit"), brglm is also more efficient computationally."
Does anyone have experience with these two packages and can tell me whether I am overlooking a weakness of brglm, or can I simply use it instead of logistf?
I'd be grateful for any insights!
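To make the comparison concrete, here is a minimal sketch of fitting the same separated model with both packages (untested; dat, x1 and x2 are placeholders for my data):

    library(logistf)
    library(brglm)

    f <- outcome ~ x1 + x2                       # placeholder formula

    m_firth <- logistf(f, data = dat)            # Firth-penalized likelihood
    m_br    <- brglm(f, family = binomial("logit"), data = dat)  # bias-reduced GLM

    coef(m_firth)
    coef(m_br)
    predict(m_br, type = "response")             # brglm objects inherit glm methods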
I am trying to optimize the averaged prediction of two logistic regressions in a classification task, using a super learner.
My measure of interest is classif.auc
The mlr3 help file (?mlr_learners_avg) tells me:
Predictions are averaged using weights (in order of appearance in the
data) which are optimized using nonlinear optimization from the
package "nloptr" for a measure provided in measure (defaults to
classif.acc for LearnerClassifAvg and regr.mse for LearnerRegrAvg).
Learned weights can be obtained from $model. Using non-linear
optimization is implemented in the SuperLearner R package. For a more
detailed analysis the reader is referred to LeDell (2015).
I have two questions regarding this information:
When I look at the source code, I think LearnerClassifAvg$new() defaults to "classif.ce". Is that true?
I think I could set it to classif.auc with param_set$values <- list(measure = "classif.auc", optimizer = "nloptr", log_level = "warn")
The help file refers to the SuperLearner package and LeDell (2015). If I understand correctly, the "AUC-Maximizing Ensembles through Metalearning" approach proposed in that paper is, however, not implemented in mlr3? Or am I missing something? Could this solution be applied in mlr3? In the mlr3 book I found a paragraph about calling an external optimization function; would that be possible for SuperLearner?
As far as I understand it, LeDell (2015) proposes and evaluates a general strategy that optimizes AUC as a black-box function by learning optimal weights. The paper does not really propose a best strategy or any concrete defaults, so I looked into the defaults of the SuperLearner package's AUC optimization strategy.
Assuming I understood the paper correctly:
LearnerClassifAvg basically implements what is proposed in LeDell (2015), namely, it optimizes the weights for any metric using non-linear optimization; LeDell (2015) focuses on the special case of optimizing AUC. As you rightly pointed out, by setting the measure to "classif.auc" you get a meta-learner that optimizes AUC. The default optimization routine differs between mlr3pipelines and the SuperLearner package: we use NLOPT_LN_COBYLA, while SuperLearner ... uses the Nelder-Mead method via the optim function to minimize rank loss (from the documentation).
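In code, the configuration discussed above would look roughly like this (an untested sketch; it assumes mlr3pipelines registers the averaging learner under the key "classif.avg", and sets predict_type = "prob" so AUC can be computed):

    library(mlr3)
    library(mlr3pipelines)   # provides LearnerClassifAvg ("classif.avg")

    avg <- lrn("classif.avg", predict_type = "prob")
    avg$param_set$values <- list(
      measure   = "classif.auc",   # instead of the default classif.ce
      optimizer = "nloptr",
      log_level = "warn"
    )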
So in order to get exactly the same behaviour, you would need to implement a Nelder-Mead bbotk::Optimizer similar to here that simply wraps stats::optim with method "Nelder-Mead", and carefully compare settings and stopping criteria. I am fairly confident that NLOPT_LN_COBYLA delivers somewhat comparable results; LeDell (2015) has a comparison of the different optimizers for further reference.
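If you only want to mimic the SuperLearner idea outside of mlr3, a rough, untested sketch (optimizing two ensemble weights for AUC with stats::optim and Nelder-Mead, on made-up data) could look like this:

    # p1, p2: cross-validated predicted probabilities of the two base learners; y: 0/1 outcome
    auc <- function(y, p) {                        # rank-based (Mann-Whitney) AUC
      r <- rank(p)
      n1 <- sum(y == 1); n0 <- sum(y == 0)
      (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
    }

    neg_auc <- function(w, p1, p2, y) {
      w <- exp(w) / sum(exp(w))                    # softmax: positive weights summing to 1
      -auc(y, w[1] * p1 + w[2] * p2)
    }

    set.seed(42)                                   # toy data, purely illustrative
    y  <- rbinom(200, 1, 0.4)
    p1 <- plogis(y + rnorm(200))
    p2 <- plogis(0.5 * y + rnorm(200))

    fit <- optim(par = c(0, 0), fn = neg_auc, p1 = p1, p2 = p2, y = y,
                 method = "Nelder-Mead")
    exp(fit$par) / sum(exp(fit$par))               # learned weights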
Thanks for spotting the error in the documentation. I agree that the description is a little unclear, and I will try to improve it!
I am new to Julia and have estimated some multilevel regressions using MixedModels. Everything worked perfectly fine, but I would like to estimate marginal means or marginal effects. In R there are two packages I am aware of for this purpose: emmeans and ggeffects. Are there similar packages in Julia?
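For context, this is roughly the workflow I use in R and would like to reproduce in Julia (a small sketch using lme4's built-in sleepstudy data):

    library(lme4)
    library(emmeans)
    library(ggeffects)

    m <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

    emmeans(m, ~ Days)              # estimated marginal means
    ggpredict(m, terms = "Days")    # adjusted predictions / marginal effects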
In Julia, there is now Effects.jl, which uses the same technique as the effects package in R (which is what ggeffects uses for its computation).
Since your question is tagged mixed-models, you might also consider JellyMe4 which adds support for lme4/MixedModels to RCall.
I don't believe there is a good package for this at the moment, though you could use RCall.jl and process your data there. Or, if you don't mind doing it manually, you could possibly calculate it from the predict() method in GLM.jl.
I'm currently using the e1071 package in R to forecast product demand using support vector regression via the svm function in the package. While support vector regression yields much higher forecast accuracy for my data compared to other methods (e.g. ARIMA, simple exponential smoothing), my results show that the svm function tends to underforecast. In my particular case, underforecasting is worse and much more expensive than overforecasting. Therefore, I want to implement something in R that tells support vector regression to penalize underforecasting much more heavily than overforecasting.
Unfortunately, I can't really find any way to do this. There seems to be nothing on this in the e1071 package. The kernlab package has a support vector function (ksvm) that implements an 'eps-bsvr bound-constraint svm regression', but I can't find any information on what is meant by bound-constraint or how to define that bound.
Has anyone seen any examples of how to do this in R? I'm only finding very mathematical papers on asymmetric loss functions for support vector regression, and I don't have the skills to translate these into R code, so I'm looking for an existing solution in R.
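For reference, this is the kind of symmetric fit I start from, together with the asymmetric cost I use to quantify how much worse the under-forecasts are (an illustrative sketch with made-up data; the 5:1 cost ratio is just a placeholder):

    library(e1071)

    set.seed(1)                                   # toy demand series with a lag-1 feature
    demand <- 100 + 10 * sin((1:60) / 6) + rnorm(60, sd = 5)
    dat <- data.frame(y = demand[-1], lag1 = demand[-60])

    fit  <- svm(y ~ lag1, data = dat, type = "eps-regression", kernel = "radial")
    pred <- predict(fit, dat)

    err <- dat$y - pred                           # positive err = under-forecast
    mean(ifelse(err > 0, 5 * err, -err))          # penalize under-forecasts 5x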
Is there an R package I could use for Bayesian parameter estimation as an alternative to JAGS? I found an old question regarding JAGS/BUGS alternatives in R; however, the last post is already 9 years old. So maybe there are new and flexible Gibbs sampling packages available in R? I want to use it to get parameter estimates for novel hierarchical hidden Markov models with random effects, covariates, etc. I highly value the flexibility of JAGS and think that JAGS is simply great; however, I want to write R functions that facilitate model specification, and I am looking for a package that I can use for parameter estimation.
There are some alternatives:
Stan, with the rstan R package. Stan looks well optimized but cannot fit certain types of models (like binomial/Poisson mixture models), since it cannot sample discrete parameters (or something like that...).
nimble, which keeps BUGS/JAGS-style model syntax but compiles models and samplers via C++ (see the sketch after this list)
if you want highly optimized sampling based on C++, you may want to check out Rcpp-based solutions from Dirk Eddelbuettel
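To illustrate the nimble option (since it keeps the BUGS/JAGS model language), here is a minimal, untested sketch of a hierarchical normal model with an entirely made-up data layout:

    library(nimble)

    code <- nimbleCode({
      for (i in 1:N) {
        y[i] ~ dnorm(mu[group[i]], sd = sigma)
      }
      for (j in 1:J) {
        mu[j] ~ dnorm(mu0, sd = tau)     # group-level random effects
      }
      mu0   ~ dnorm(0, sd = 10)
      tau   ~ dunif(0, 10)
      sigma ~ dunif(0, 10)
    })

    constants <- list(N = 100, J = 5, group = sample(1:5, 100, replace = TRUE))
    data      <- list(y = rnorm(100))
    inits     <- list(mu0 = 0, tau = 1, sigma = 1, mu = rep(0, 5))

    samples <- nimbleMCMC(code, constants = constants, data = data, inits = inits,
                          nchains = 2, niter = 5000, nburnin = 1000)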
I wonder if anyone knows how to parallelize the rfcv() function implemented in the R package 'randomForest'. Sorry if the question sounds very basic, but I tried to do this using 'foreach' without any success.
Have a look at the caret package and its documentation.
It is not only more general (allowing for more models than "just" random forests) but also integrates pre- and post-processing, while giving you parallel execution where feasible, particularly for evaluation and cross-validation, which is an "embarrassingly parallel" problem.
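For instance, a rough (untested) equivalent of rfcv() in caret is recursive feature elimination with the random-forest helper functions; once a foreach backend is registered, the resampling runs in parallel. The data and candidate subset sizes below are placeholders:

    library(caret)
    library(randomForest)
    library(doParallel)

    cl <- makePSOCKcluster(4)                      # adjust to your number of cores
    registerDoParallel(cl)

    ctrl <- rfeControl(functions = rfFuncs,        # random-forest helpers shipped with caret
                       method = "cv", number = 5,
                       allowParallel = TRUE)

    res <- rfe(x = iris[, 1:4], y = iris$Species,  # iris stands in for your data
               sizes = c(1, 2, 3), rfeControl = ctrl)

    stopCluster(cl)
    res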