I would like to impute my data using rfImpute() from the randomForest CRAN package in R. However, I was wondering whether it is also possible to tune the hyperparameters 'iter' and 'ntree' and use the optimal values when imputing my data?
I saw that hyperparameter optimization exists for regression and classification with randomForest, but is it also possible for rfImpute()? :)
Thanks in advance for any help.
Related
I'm trying to validate, or test the reliability of, a multivariate autoregressive model estimated with the MAR1 package.
As far as I understand, there is no such function in this package.
As one possible solution, I tried to use plot(model, plot.type = "model.resids.ytT"), which is introduced in the user guide of the MARSS package, to check whether the model has convergence problems.
However, the output plot was the same as the plot of coefficients obtained with plot(model$top.benefit).
I would appreciate it if you could tell me the best way to do this.
Sincerely,
I have trained a random survival forest using the R package randomForestSRC. For publication, I would like to visualize some selected trees, preferably using the ggraph package, much like here: https://shiring.github.io/machine_learning/2017/03/16/rf_plot_ggraph
The randomForest package has a convenient function, randomForest::getTree, but so far I have not found an analogous function in randomForestSRC.
How are the trees stored in the random survival forest, and how can I access them? I'd be grateful for any hints!
I have built an SVM model using an SVM package in R for a classification problem. It reaches only 87% accuracy, whereas a random forest reaches around 92.4%.
fit.svm <- svm(modelformula, data = training, gamma = 0.01, cost = 1, cross = 5)
I would like to use boosting to tune this SVM model. Can someone help me tune it?
What are the best parameters I can provide for the SVM method?
An example of boosting for an SVM model would be appreciated.
To answer your first question:
The e1071 package in R has a built-in tune() function that performs cross-validation. This will help you select the optimal parameters cost, gamma, and kernel. You can also fit an SVM in R with the kernlab package. You may get different results from the two packages. Let me know if you need any examples.
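As a rough illustration of the tune() approach mentioned above, here is a minimal sketch of a grid search over gamma and cost with 5-fold cross-validation. The iris data and the grid values are assumptions chosen only for illustration; they stand in for the asker's `training` data and `modelformula`, which are not shown.

```r
library(e1071)

set.seed(42)

# Grid search over gamma and cost, scored by 5-fold cross-validation.
# iris and the grid values below are placeholders, not the asker's data.
tuned <- tune(
  svm, Species ~ ., data = iris,
  ranges = list(gamma = 10^(-3:0), cost = 10^(0:2)),
  tunecontrol = tune.control(sampling = "cross", cross = 5)
)

tuned$best.parameters    # gamma/cost pair with the lowest CV error
tuned$best.performance   # the corresponding cross-validated error
best_svm <- tuned$best.model
```

The grid can be refined around the best pair found in a second pass if the optimum sits on the edge of the range.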
You may want to look into the caret package. It allows you to both pick various kernels for SVM (model list) and also run parameter sweeps to find the best model.
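A parameter sweep with caret might look roughly like the sketch below. It assumes the kernlab package is installed (caret's "svmRadial" method relies on it) and again uses iris as a placeholder dataset; the choice of 5 folds and 5 candidate values is arbitrary.

```r
library(caret)

set.seed(42)

# 5-fold cross-validation as the resampling scheme
ctrl <- trainControl(method = "cv", number = 5)

# Radial-kernel SVM; tuneLength asks caret to try 5 candidate cost values.
# iris is a stand-in for the asker's training data.
fit <- train(
  Species ~ ., data = iris,
  method = "svmRadial",
  trControl = ctrl,
  tuneLength = 5,
  preProcess = c("center", "scale")
)

fit$bestTune   # tuning parameters of the selected model
fit$results    # resampled accuracy for every candidate
```

Other kernels can be tried by changing `method`, for example to "svmLinear" or "svmPoly".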
I can see how cv.glm works with a glm object, but what about fitted survival models?
I have a bunch of models (Weibull, Gompertz, lognormal, etc). I want to assess the prediction error using cross validation. Which package/function can do this in R?
SuperLearner can do V-fold cross-validation for a large library of underlying machine learning algorithms, but I am not sure whether it includes survival models. Alternatively, take a look at the cvTools package, which is designed to help cross-validate any prediction algorithm you give it.
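If neither package fits, the cross-validation loop can also be written by hand. The sketch below is only one possible approach and makes several assumptions: it uses survreg() from the survival package on the bundled lung data, scores each fold by the concordance index (a measure of discrimination rather than a prediction error in the strict sense), and the helper `cv_concordance` is a made-up name. survreg() does not offer a Gompertz distribution, so only Weibull, lognormal and log-logistic fits are compared here.

```r
library(survival)

set.seed(1)

# Placeholder data: the lung dataset shipped with the survival package
dat <- na.omit(lung[, c("time", "status", "age", "sex")])
k <- 5
fold <- sample(rep(seq_len(k), length.out = nrow(dat)))

# Hypothetical helper: mean out-of-fold concordance for one distribution
cv_concordance <- function(dist) {
  scores <- sapply(seq_len(k), function(i) {
    fit <- survreg(Surv(time, status) ~ age + sex,
                   data = dat[fold != i, ], dist = dist)
    test <- dat[fold == i, ]
    # Linear predictor on the held-out fold (larger = longer survival)
    test$lp <- predict(fit, newdata = test, type = "lp")
    concordance(Surv(time, status) ~ lp, data = test)$concordance
  })
  mean(scores)
}

cv_scores <- sapply(c("weibull", "lognormal", "loglogistic"), cv_concordance)
cv_scores
```

A different score, such as a Brier score at a fixed time point, could be swapped into the inner function without changing the fold logic.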
I've been using the caret package in R to run some boosted regression tree and random forest models and am hoping to generate prediction intervals for a set of new cases using the inbuilt cross-validation routine.
The trainControl function allows you to save the hold-out predictions for each of the n folds, but I'm wondering whether unknown cases can also be predicted at each fold using the built-in functions, or whether I need to use a separate loop to build the models n times.
Any advice much appreciated.
Check the R package quantregForest, available on CRAN. It can easily calculate prediction intervals for random forest models. There's a nice paper by the author of the package explaining the background of the method. (Sorry, I can't say anything about prediction intervals for BRT models; I'm looking for them myself...)
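A minimal sketch of how quantregForest could be used to get prediction intervals for new cases follows. The airquality data, the 80-row training split and the 5%/95% quantiles are assumptions made for illustration only.

```r
library(quantregForest)

set.seed(1)

# Placeholder data: airquality with incomplete rows dropped
aq <- na.omit(airquality)
train_idx <- sample(nrow(aq), 80)

x_train <- aq[train_idx, -1]      # predictors (all columns except Ozone)
y_train <- aq$Ozone[train_idx]    # response
x_new   <- aq[-train_idx, -1]     # "new" cases to predict

qrf <- quantregForest(x = x_train, y = y_train, ntree = 500)

# 5%, 50% and 95% conditional quantiles; the outer pair gives a
# 90% prediction interval for each new case
pred <- predict(qrf, newdata = x_new, what = c(0.05, 0.5, 0.95))
head(pred)
```

Each row of `pred` corresponds to one new case, with the lower bound, median and upper bound in the three columns.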