I've been using randomForest::getTree() to give me a text representation of each tree in my forest, but I have recently switched to the caret package to train the forests (method = 'rf'). How can I either get an object that getTree() understands (since caret is allegedly using the same underlying code) or print the trees in some other analogous way?
just figured it out:
library(caret)
library(randomForest)   # getTree() lives in randomForest
.... # load data into x (predictors) and y (outcome)
model <- train(x, y, method = 'rf')
# model$finalModel is a plain randomForest object, so getTree() accepts it;
# k picks which tree to print and labelVar = TRUE shows variable names instead of indices
getTree(model$finalModel, k = 1, labelVar = TRUE)
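For a self-contained illustration, here is a minimal sketch on stand-in data (iris; the object names and ntree value are just examples, not from the original post):

library(caret)
library(randomForest)

# Fit a small forest via caret, then pull out tree number 5 as a data frame
fit <- train(Species ~ ., data = iris, method = 'rf', ntree = 50)
tree5 <- getTree(fit$finalModel, k = 5, labelVar = TRUE)
head(tree5)  # columns: left/right daughter, split var, split point, status, prediction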
I have built an SVM-RBF model in R using caret. Is there a way of plotting the decision boundary?
I know it is possible to do so using other R packages, but unfortunately I'm forced to use caret because it is the only package I found that lets me calculate variable importance.
Alternatively, can you suggest a package that can plot the decision boundaries AND also gives variable importance?
Thank you very much
First of all, unlike some other methods, an SVM does not produce feature importances of its own. In your case, the importance score caret reports is calculated independently of the model itself: https://topepo.github.io/caret/variable-importance.html#model-independent-metrics
Second, the decision boundary (or hyperplane) you see in most textbook examples comes from a toy problem with only two or three features. If you have more than three features, it is not trivial to visualize this hyperplane.
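To make both points concrete, here is a rough sketch on a hypothetical two-predictor toy problem (iris, not the asker's data): varImp() falls back to caret's model-independent filter importance, and the boundary can only be drawn because exactly two features are used.

library(caret)

# Toy setup: RBF-SVM on two predictors only, so the boundary is drawable
fit <- train(Species ~ Petal.Length + Petal.Width, data = iris,
             method = 'svmRadial',
             trControl = trainControl(method = 'cv', number = 5))

varImp(fit)  # model-independent (filter) importance; the SVM has none built in

# Predict over a grid of the two features and draw the predicted class regions
grid <- expand.grid(
  Petal.Length = seq(min(iris$Petal.Length), max(iris$Petal.Length), length.out = 200),
  Petal.Width  = seq(min(iris$Petal.Width),  max(iris$Petal.Width),  length.out = 200))
grid$pred <- predict(fit, newdata = grid)
plot(grid$Petal.Length, grid$Petal.Width, col = grid$pred, pch = 15, cex = 0.3,
     xlab = 'Petal.Length', ylab = 'Petal.Width')
points(iris$Petal.Length, iris$Petal.Width, col = iris$Species, pch = 19)

With more than two or three predictors this kind of plot no longer applies, as noted above.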
I am a newbie in R and I need to know how to plot a tree selected from a random forest model created with the train() function in the caret package.
I used a training dataset to fit a random forest model with train(). The resulting forest contains about 500 trees. Is there any way to plot a selected tree?
Thank you.
CRAN package party offers a method called prettyTree.
Look here
As far as I know, the randomForest package does not have any built-in functionality to plot individual trees. You can extract a tree with the getTree() function, but nothing is provided to plot or visualize it. This question may be a duplicate; a quick search turned up approaches other people have used to extract trees from a random forest, found
here and here and here
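A minimal sketch of the party route mentioned above (iris is stand-in data): a single conditional inference tree can be grown and plotted directly, though it is an analogous single tree rather than one of the forest's ~500 trees.

library(party)

# Grow and plot one conditional inference tree as a visual stand-in
single_tree <- ctree(Species ~ ., data = iris)
plot(single_tree)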
I have trained a Random Forest classifier (randomForest package) and it also returns a confusion matrix.
I want to compute sensitivity and specificity for each class, so I decided to load the confusion matrix into the caret package.
However, I did not find how to do this. How can I achieve it?
Thanks to Sowmya's comments, I achieved it with the following code:
library(caret)
confusionMatrix(clf$predicted, labels)
where clf is the object returned by the randomForest() call and labels is the vector of reference labels of the outcome variable.
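As a self-contained sketch of that workflow (iris is stand-in data; byClass is where the per-class sensitivity and specificity appear):

library(randomForest)
library(caret)

clf <- randomForest(Species ~ ., data = iris, ntree = 200)
cm  <- confusionMatrix(clf$predicted, iris$Species)  # OOB predictions vs. reference labels
cm$byClass  # per-class sensitivity, specificity, and related statistics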
I am trying to fit a model on training data with the caret package's classifiers, but it does not respond for a very long time (I have waited for 2 hours). On the other hand, it works for other datasets.
Here is the link to my training data: http://www.htmldersleri.org/train.csv (it is the well-known Reuters-21570 data set)
And the command I am using is:
model <- train(class ~ ., data = train, method = "knn")
Note: with any other method (e.g. svm, naive bayes, etc.), it gets stuck anyway.
Note 2: with package e1071, the naiveBayes classifier works, but with 0.08% accuracy!
Can anyone tell me what can be the problem? Thanks in advance.
This seems to be a multiclass classification problem. I'm not sure if caret supports that. However, I can show you how you would do the same thing with the mlr package:
library(mlr)
x <- read.csv("http://www.htmldersleri.org/train.csv")
tsk <- makeClassifTask(data = x, target = 'class')
# Assess the performance with 10-fold cross-validation
crossval('classif.knn', tsk)
If you want to know which learners are integrated in mlr that support this kind of task, type
listLearners(tsk)
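For completeness, a small sketch of how the resampling result could be captured and summarised (the iters and measures choices are illustrative, not from the original answer):

# Capture the resample object and look at aggregated accuracy across the folds
res <- crossval('classif.knn', tsk, iters = 10L, measures = list(acc))
res$aggr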
I'm using the "party" package to create random forest of regression trees.
I've created a ForestControl object in order to limit the number of trees (ntree), the number of nodes (maxdepth), and the number of variables used to fit each tree (mtry).
One thing I'm not sure of is whether the cforest algorithm uses a subset of my training set for each tree it generates.
I've seen in the documentation that it uses bagging, so I assume it should, but I'm not sure I understand what the "subset" argument to that function is for.
I'm also puzzled by the results I get using ctree: when plotting the tree, I see that all the samples of my training set are classified into the terminal nodes, whereas I would have expected it to use only a subset here too.
So my question is: is cforest doing the same thing as ctree, or is it really bagging my training set?
Thanks in advance for your help!
Ben
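For context, a minimal sketch of the kind of setup described above (iris is stand-in data and the parameter values are illustrative): in party, the controls object is where ntree, mtry, and the per-tree resampling behaviour (replace / fraction) are set.

library(party)

# Illustrative only: build a forest controls object and fit a small forest.
# replace / fraction govern how each tree's sample of the training data is drawn.
ctrl <- cforest_control(ntree = 100, mtry = 3, replace = FALSE, fraction = 0.632)
fit  <- cforest(Species ~ ., data = iris, controls = ctrl)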