How to run model diagnostics and validate binomial GAMs? - r

I'm looking for methods to test the overall fit of a model, run model diagnostics to help with model selection and methods for model validation for binomial GAMs.
If anyone knows of a way to do this in R (i.e. packages and functions), that would be extremely helpful as well. I have heard of DHARMa, but am at a loss as to how I would use the package.
Any links with more information would also be appreciated.
Currently, all I have been able to do is ROC curves and AUC values.
Thanks
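As a rough starting point, the sketch below fits a binomial GAM with mgcv on simulated data and runs the usual checks. The data, the variable names (x1, x2, y) and the model formula are assumptions made up for illustration. For 0/1 responses the standard residual plots from gam.check() tend to be hard to read, which is where DHARMa's simulation-based residuals can help; that part only runs if DHARMa is installed.

```r
library(mgcv)  # recommended package shipped with standard R installations

set.seed(42)
n  <- 500
x1 <- runif(n)
x2 <- runif(n)
eta <- -1 + 2 * sin(2 * pi * x1) + 1.5 * x2   # simulated true linear predictor
dat <- data.frame(y = rbinom(n, 1, plogis(eta)), x1 = x1, x2 = x2)

# Binomial GAM with one smooth per covariate, smoothness selected by REML
fit <- gam(y ~ s(x1) + s(x2), family = binomial, data = dat, method = "REML")

summary(fit)     # approximate significance of smooths, deviance explained
gam.check(fit)   # basis-dimension (k) checks plus residual plots
AIC(fit)         # compare against candidate models for model selection

# Simulation-based residual checks with DHARMa, only if it is installed
if (requireNamespace("DHARMa", quietly = TRUE)) {
  sim_res <- DHARMa::simulateResiduals(fittedModel = fit, n = 250)
  plot(sim_res)                      # QQ plot and residuals vs. predictions
  DHARMa::testUniformity(sim_res)    # are scaled residuals uniform?
  DHARMa::testDispersion(sim_res)    # over- or underdispersion?
}
```

For validation beyond ROC/AUC, the same fit can be repeated inside a k-fold loop so that predictions are scored on held-out data only.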

Related

Validation for multivariate autoregressive model (MAR) with MAR1 package

I'm trying to validate, or test the reliability of, a multivariate autoregressive model estimated with the MAR1 package.
As far as I understand, there is no such function in this package.
As one possible solution, I tried plot(model, plot.type = "model.resids.ytT"), which is described in the user guide of the MARSS package, to check whether the model has convergence problems.
However, the output plot was the same as the plot of coefficients obtained by the function "plot(model$top.benefit)".
We would appreciate it if you could tell us the best way to do this.
Sincerely,

Obtaining glmer coefficient confidence intervals via bootstrapping

This is my first experience using mixed models in R for my statistical analysis. Because my data consist of binary outcome variables, I have built a logistic model using the glmer function of the lme4 package that I think works as I wanted it to.
I am now aiming to investigate the statistical significance of my model coefficients. I have read that generally, the best approach for generalized mixed models is to bootstrap confidence intervals, but I haven't managed to find a good, clear, explanation of how to do this in R.
Would anyone have any suggestions? Are there any packages in R that expedite this process, or do people generally build their own functions for this? I haven't really done any bootstrapping before so I'd appreciate some more in-depth answers.
If you want to compute parametric bootstrap confidence intervals, the built-in functionality
confint(fitted_model, method = "boot")
should work (see ?confint.merMod)
Also see this answer (which illustrates both parametric and nonparametric bootstrapping for user-defined quantities).
If you have multiple cores, you can speed this up by adding parallel = "multicore", ncpus = parallel::detectCores()-1 (or some other appropriate number of cores to use): see ?lme4::bootMer for details.
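A minimal sketch of the confint() route, assuming the cbpp example data that ships with lme4 stands in for the asker's data. The model formula and the small nsim are illustrative choices only; a real analysis would normally use many more bootstrap replicates.

```r
library(lme4)

# cbpp is an example data set shipped with lme4 (binomial outcome per herd)
fit <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             data = cbpp, family = binomial)

set.seed(101)
# Parametric bootstrap percentile intervals for the fixed effects only
# (parm = "beta_"); nsim is kept small here so the example runs quickly
ci_boot <- confint(fit, method = "boot", nsim = 100,
                   parm = "beta_", boot.type = "perc")
print(ci_boot)
```

A fixed effect whose interval excludes zero is the bootstrap analogue of a significant coefficient at the chosen level.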

SVM for data prediction in R

I'd like to use the 'e1071' library for fitting an SVM model. So far, I've built a model that fits a regression curve to the data set.
(take a look at the purple curve):
However, I want the SVM model to "follow" the data, such that the prediction for each value is as close as possible to the actual data. I think this is possible because of this graph, which shows an SVM (model 2) behaving similarly to an ARIMA model (model 1):
I tried changing the kernel to no avail. Any help will be much appreciated.
Fine-tuning an SVM is no easy task. Have you considered other models, for example GAMs (generalized additive models)? These work well on very curvy data.
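To illustrate the GAM suggestion, here is a small sketch with mgcv on a simulated wiggly series. The data and the choice of k = 30 are assumptions for demonstration, not values taken from the question.

```r
library(mgcv)  # recommended package shipped with standard R installations

set.seed(7)
x <- seq(0, 10, length.out = 300)
y <- sin(x) + 0.5 * sin(3 * x) + rnorm(length(x), sd = 0.2)  # simulated data
dat <- data.frame(x = x, y = y)

# k is the maximum basis dimension, i.e. an upper bound on how wiggly
# the smooth may become; REML then picks the actual amount of smoothing
fit <- gam(y ~ s(x, k = 30), data = dat, method = "REML")

plot(dat$x, dat$y, pch = 16, col = "grey60", xlab = "x", ylab = "y")
lines(dat$x, fitted(fit), col = "purple", lwd = 2)

summary(fit)$r.sq  # adjusted R-squared of the fitted smooth
```

Raising k lets the curve track the data more closely, while the REML penalty guards against simply interpolating the noise.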

Cross validation on fitted survival objects?

I can see how cv.glm works with a glm object, but what about fitted survival models?
I have a bunch of models (Weibull, Gompertz, lognormal, etc). I want to assess the prediction error using cross validation. Which package/function can do this in R?
SuperLearner can do V-fold cross-validation for a large library of underlying machine learning algorithms, though I'm not sure that it includes survival models. Alternatively, take a look at the cvTools package, which is designed to help do cross-validation of any prediction algorithm you give it.
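If neither package fits, a hand-rolled loop is also an option. The sketch below is one possible approach, not something taken from either package: it runs 5-fold cross-validation over parametric models from survival::survreg and scores each held-out fold by concordance. The lung data, the covariates and the helper name cv_concordance are assumptions for illustration. survreg has no Gompertz distribution, so that model would need another package.

```r
library(survival)  # recommended package shipped with standard R installations

set.seed(1)
dat  <- lung[, c("time", "status", "age", "sex")]  # example data, no NAs here
k    <- 5
fold <- sample(rep(seq_len(k), length.out = nrow(dat)))  # random fold labels

# Hypothetical helper: mean held-out concordance for one distribution
cv_concordance <- function(dist) {
  scores <- numeric(k)
  for (i in seq_len(k)) {
    train <- dat[fold != i, ]
    test  <- dat[fold == i, ]
    fit <- survreg(Surv(time, status) ~ age + sex, data = train, dist = dist)
    # Predicted survival time: larger values should mean longer survival
    test$pred <- predict(fit, newdata = test, type = "response")
    scores[i] <- concordance(Surv(time, status) ~ pred, data = test)$concordance
  }
  mean(scores)
}

dists <- c(weibull = "weibull", lognormal = "lognormal",
           loglogistic = "loglogistic")
cv_results <- sapply(dists, cv_concordance)
print(cv_results)
```

Concordance only measures how well the models rank subjects, so a calibration-sensitive score such as the held-out log-likelihood may separate the distributions more clearly.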

prediction intervals with caret

I've been using the caret package in R to run some boosted regression tree and random forest models and am hoping to generate prediction intervals for a set of new cases using the inbuilt cross-validation routine.
The trainControl function allows you to save the hold-out predictions at each of the n-folds, but I'm wondering whether unknown cases can also be predicted at each fold using the built-in functions, or whether I need to use a separate loop to build the models n-times.
Any advice much appreciated
Check the R package quantregForest, available on CRAN. It can easily calculate prediction intervals for random forest models. There's a nice paper by the author of the package explaining the background of the method. (Sorry, I can't say anything about prediction intervals for BRT models; I'm looking for them myself...)
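A short sketch of how that might look, assuming quantregForest is installed and using the built-in airquality data as a stand-in for the asker's data. The train/new split and the 5% and 95% quantiles are illustrative choices.

```r
library(quantregForest)  # assumed to be installed from CRAN

set.seed(3)
aq  <- na.omit(airquality)            # built-in data set, complete cases only
idx <- sample(nrow(aq), 80)           # rows used for training
predictors <- c("Solar.R", "Wind", "Temp")

x_train <- aq[idx, predictors]
y_train <- as.numeric(aq$Ozone[idx])
x_new   <- aq[-idx, predictors]       # treated as the "new cases"

qrf <- quantregForest(x = x_train, y = y_train, ntree = 500)

# Conditional quantiles: 5% and 95% bound a 90% prediction interval,
# and 50% is the conditional median
pred_int <- predict(qrf, newdata = x_new, what = c(0.05, 0.5, 0.95))
head(pred_int)
```

Each row of pred_int gives the lower bound, median and upper bound for one new case.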
