Cross validation on fitted survival objects? - r

I can see how cv.glm works with a glm object, but what about fitted survival models?
I have a bunch of models (Weibull, Gompertz, lognormal, etc.) and want to assess their prediction error using cross-validation. Which package or function can do this in R?

SuperLearner can do V-fold cross-validation for a large library of underlying machine learning algorithms, though I'm not sure whether it includes survival models. Alternatively, take a look at the cvTools package, which is designed to help cross-validate any prediction algorithm you give it.
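If neither package fits, a manual K-fold loop is another option. Below is a minimal sketch, assuming the survival package, its bundled lung data, two covariates, and held-out concordance as the performance measure (all chosen here for illustration, not taken from the question). Gompertz is left out because survreg() does not offer that distribution.

```r
library(survival)

set.seed(1)
# Illustrative data: the lung data set that ships with the survival package
dat <- na.omit(lung[, c("time", "status", "age", "sex")])

K <- 5
# Randomly assign each row to one of K folds
fold <- sample(rep(seq_len(K), length.out = nrow(dat)))

# Mean held-out concordance for one parametric distribution
cv_concordance <- function(dist) {
  scores <- vapply(seq_len(K), function(k) {
    train <- dat[fold != k, ]
    test  <- dat[fold == k, ]
    fit <- survreg(Surv(time, status) ~ age + sex, data = train, dist = dist)
    # Predicted survival time on the held-out fold
    test$pred <- predict(fit, newdata = test, type = "response")
    concordance(Surv(time, status) ~ pred, data = test)$concordance
  }, numeric(1))
  mean(scores)
}

cv_results <- sapply(c(weibull = "weibull", lognormal = "lognormal"),
                     cv_concordance)
print(cv_results)
```

Concordance only measures how well the models rank subjects; a held-out log-likelihood or Brier score could be swapped into the same loop if calibration matters.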

Related

Validation for multivariate autoregressive model (MAR) with MAR1 package

I'm trying to validate, or test the reliability of, a multivariate autoregressive model estimated with the MAR1 package.
As far as I understand, the package has no function for this.
As one possible solution, I tried plot(model, plot.type = "model.resids.ytT"), which is described in the MARSS package user guide, to check whether the model has convergence problems.
However, the output was the same as the plot of coefficients produced by plot(model$top.benefit).
I would appreciate it if you could tell me the best way to do this.

SVM for data prediction in R

I'd like to use the 'e1071' library to fit an SVM model. So far, I've fitted a model that produces a regression curve for the data set
(take a look at the purple curve):
However, I want the SVM model to "follow" the data, so that the prediction for each value is as close as possible to the actual data. I think this is possible because of this graph, which shows an SVM (model 2) behaving similarly to an ARIMA model (model 1):
I tried changing the kernel, to no avail. Any help will be much appreciated.
Fine-tuning an SVM is no easy task. Have you considered other models, for example GAMs (generalized additive models)? These work well on very curvy data.
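A minimal sketch of the GAM suggestion, assuming the mgcv package and a synthetic noisy sine curve standing in for the poster's data (which is not shown in the question):

```r
library(mgcv)

set.seed(1)
# Synthetic "curvy" data, used here only for illustration
d <- data.frame(x = seq(0, 10, length.out = 200))
d$y <- sin(d$x) + rnorm(nrow(d), sd = 0.2)

# s(x) fits a penalized smooth; its wiggliness is chosen automatically
fit <- gam(y ~ s(x), data = d)
pred <- predict(fit, newdata = d)

plot(d$x, d$y, col = "grey50", xlab = "x", ylab = "y")
lines(d$x, pred, col = "purple", lwd = 2)
```

If the fitted curve looks too smooth, raising the basis dimension, e.g. s(x, k = 20), lets it track the data more closely.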

MLR MARS/Earth classifier: flexible discriminant analysis or logistic regression?

I'm trying to learn about MARS/Earth models for classification and am using "classif.earth" in the MLR package in R. My issue is that the MLR documentation says that "classif.earth" performs flexible discriminant analysis using the earth algorithm.
However, when I look at the code:
(https://github.com/mlr-org/mlr/blob/master/R/RLearner_classif_earth.R)
I don't see a call to fda in the mda package; rather, it directs earth to fit a glm with a default logit link.
So tell me if I'm wrong, but it seems to me that "classif.earth" is not doing flexible discriminant analysis but rather fitting a logistic regression on the earth model.
The implementation uses MARS to perform the FDA, where the MARS model determines the different groups. You can find more information in this paper; I quote from the abstract:
Linear discriminant analysis is equivalent to multiresponse linear regression [...] to represent the groups.
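For comparison, the two approaches being distinguished can be written out directly. This is a sketch on the iris data (chosen here for illustration), assuming the earth and mda packages are installed. It only shows what each call looks like and is not a statement of what classif.earth does internally.

```r
library(earth)
library(mda)

# (a) earth followed by a GLM with a logit link on the earth basis functions
fit_glm <- earth(Species ~ ., data = iris,
                 glm = list(family = binomial))

# (b) flexible discriminant analysis, with earth as the regression method
fit_fda <- fda(Species ~ ., data = iris, method = earth)

print(class(fit_glm))
print(class(fit_fda))
```

Inspecting summary(fit_glm) and print(fit_fda) side by side makes it easier to see which of the two a wrapper is reproducing.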

R. How to boost the SVM model

I have built an SVM model in R for a classification problem. I got only 87% accuracy, whereas a random forest produces around 92.4%.
fit.svm <- svm(modelformula, data = training, gamma = 0.01, cost = 1, cross = 5)
I would like to use boosting to tune this SVM model. Can someone help me tune it?
What are the best parameters I can provide for the SVM method?
An example of boosting for an SVM model would be appreciated.
To answer your first question.
The e1071 library in R has a built-in tune() function that performs cross-validation. This will help you select the optimal parameters (cost, gamma, kernel). You can also fit an SVM in R with the kernlab package. You may get different results from the two libraries. Let me know if you need any examples.
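A minimal sketch of tune() in use, assuming the iris data and an arbitrary small grid for gamma and cost (neither comes from the question):

```r
library(e1071)

set.seed(1)
# Grid search over gamma and cost, scored by 5-fold cross-validation
tuned <- tune(svm, Species ~ ., data = iris,
              ranges = list(gamma = 10^(-3:0), cost = 10^(0:2)),
              tunecontrol = tune.control(cross = 5))

print(tuned$best.parameters)   # gamma/cost pair with the lowest CV error
print(tuned$best.performance)  # the corresponding CV error
best_fit <- tuned$best.model   # svm refitted on all data with those values
```

To apply this to the model above, replace the formula and data with modelformula and training.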
You may want to look into the caret package. It allows you to both pick various kernels for SVM (model list) and also run parameter sweeps to find the best model.
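A minimal caret sketch, assuming the iris data, a radial-kernel SVM ("svmRadial", which needs the kernlab package installed), and 5-fold cross-validation; these choices are illustrative only:

```r
library(caret)

set.seed(1)
ctrl <- trainControl(method = "cv", number = 5)

# tuneLength = 3 asks caret to try three candidate cost values
fit <- train(Species ~ ., data = iris,
             method = "svmRadial",
             trControl = ctrl,
             tuneLength = 3)

print(fit$bestTune)  # parameter combination with the best CV accuracy
```

Changing method to "svmLinear" or "svmPoly" reruns the same sweep with a different kernel.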

prediction intervals with caret

I've been using the caret package in R to run some boosted regression tree and random forest models, and I'm hoping to generate prediction intervals for a set of new cases using the built-in cross-validation routine.
The trainControl function allows you to save the hold-out predictions for each of the n folds, but I'm wondering whether unknown cases can also be predicted at each fold using the built-in functions, or whether I need to write a separate loop that builds the models n times.
Any advice is much appreciated.
Check the R package quantregForest, available on CRAN. It can easily calculate prediction intervals for random forest models. There's a nice paper by the author of the package explaining the background of the method. (Sorry, I can't say anything about prediction intervals for BRT models; I'm looking for them myself.)
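A minimal sketch of that approach, assuming the quantregForest package and the built-in airquality data with an arbitrary train/new split (both chosen here for illustration):

```r
library(quantregForest)

set.seed(1)
dat <- na.omit(airquality)
predictors <- c("Solar.R", "Wind", "Temp", "Month", "Day")

idx <- sample(nrow(dat), 80)          # rows used for training
x_train <- dat[idx, predictors]
y_train <- dat$Ozone[idx]
x_new   <- dat[-idx, predictors]      # stand-in for the "new cases"

qrf <- quantregForest(x = x_train, y = y_train)

# 5% and 95% conditional quantiles give a 90% prediction interval per row
pi90 <- predict(qrf, newdata = x_new, what = c(0.05, 0.95))
head(pi90)
```

Each row of pi90 holds the lower and upper bound for the corresponding row of x_new.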
