Extract coefficients within R caret

library(caret)
data(iris)
train_control <- trainControl(method="repeatedcv", number=10, repeats=10)
model <- train(Sepal.Length~Sepal.Width+Petal.Length+Petal.Width, data=iris, trControl=train_control, method="lm")
I can get the coefficients of the final selected model with model$finalModel$coefficients. Is there any way to get the coefficients for all models?
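As far as I know, train() keeps only the final model and not the fits from the individual resamples. One possible workaround, shown here as a minimal sketch, is to reuse the resampling indices that caret stores in model$control$index and refit lm on each training subset. The names form and resample_coefs are made up for this illustration. The sketch assumes no pre-processing is involved, so that a plain lm refit matches what caret fitted in each resample.

```r
library(caret)
data(iris)

set.seed(42)  # arbitrary seed, only to make the folds reproducible
train_control <- trainControl(method = "repeatedcv", number = 10, repeats = 10)
form <- Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width
model <- train(form, data = iris, trControl = train_control, method = "lm")

# model$control$index is a named list (one element per fold and repeat)
# holding the row numbers used for training in that resample.
# Refit lm on each subset and collect the coefficients, one row per resample.
resample_coefs <- t(sapply(model$control$index, function(rows) {
  coef(lm(form, data = iris[rows, ]))
}))

dim(resample_coefs)       # resamples x coefficients
head(resample_coefs)
colMeans(resample_coefs)  # average coefficient across resamples
```

The row names of resample_coefs follow caret's fold labels, so each row can be traced back to a specific fold and repeat.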

Related

DALEX for classification problems

I built a penalized logistic regression model with caret, and I am trying to create an explainer object with DALEX::explain so that I can then analyze various aspects of the model.
Perhaps the problem lies in having a binary classification model.
Here is my reproducible code:
library(caret)
library(DALEX)
library(modelStudio)
set.seed(10)
data<- as.data.frame(mtcars)
data$vs<- as.factor(data$vs)
trc<- trainControl(method = "repeatedcv", number=3,repeats=4, classProbs=FALSE)
model <- caret::train(vs~., data=data, trControl= trc, family="binomial", method = "regLogistic")
explainer<- DALEX::explain(
model = model,
data= as.data.frame(data[, -which(colnames(data) %in% "vs")]),
y = as.numeric(as.character(data$vs)),
predict_function = predict,
label = "regLogistic")
modelStudio::modelStudio(explainer)

Extract the Intercept from a Caret LASSO Model

I'm using the caret package in R to fit a LASSO regression model. My code runs fine, however I would like to extract the Intercept for the final model so I can build a scoring key using the selected predictors and coefficients.
For example, if "Extraversion" is the variable I am trying to model using survey items, I would like to produce the following scoring key:
Intercept + Survey_Item_1*Slope + Survey_Item_2*Slope + and so on
FWIW, I am able to extract the coefficients for the predictors.
My code for reference:
##Create Training & test set
set.seed(9808)
ind <- sample(0:1, nrow(df), replace=TRUE, prob=c(.75,.25))
train <- df[ind==0,]
test <- df[ind==1,]
ctrl <- trainControl(method = "repeatedcv", number=5, repeats = 5)
##Train Lasso model
fit.lasso <- train(Extraversion ~., data=train, method="lasso", preProc=c('scale','center','nzv'), trControl=ctrl)
fit.lasso
predict.enet(fit.lasso$finalModel, type='coefficients', s=fit.lasso$bestTune$fraction, mode='fraction')
##Fit models to test data
lasso_test<- predict(fit.lasso, newdata=test, na.action="na.pass")
postResample(pred = lasso_test, obs = test[,c(1)])

Random Forest Accuracy

I am using the random forest algorithm to predict a target variable "Y" that has 4 values.
The syntax below is used to create the model:
control <- trainControl(method="repeatedcv", number=2, repeats=1, search="random")
seed <- 7
metric <- "Accuracy"
set.seed(seed)
mtry <- sqrt(ncol(train))
model <- train(Target~., data=complete, method="rf", metric=metric, tuneLength=15, trControl=control)
But when I evaluate the trained model on the test dataset, it gives an accuracy of only about 50%. Is there any way to increase the accuracy to 70% or above?

run caret method without parameters

I'm trying to run a caret method that does not require tuning parameters, such as lda. The example below uses "lvq", which needs 2 parameters (size and k):
set.seed(7)
# load the library
library(caret)
# load the dataset
data(iris)
# prepare training scheme
control <- trainControl(method="repeatedcv", number=10, repeats=3)
# design the parameter tuning grid
grid <- expand.grid(size=c(5,10,20,50), k=c(1,2,3,4,5))
# train the model
model <- train(Species~., data=iris, method="lvq", trControl=control, tuneGrid=grid)
# summarize the model
print(model)
plot(model)
I tried to work it out by assigning tuneGrid=NULL:
set.seed(7)
# load the library
library(caret)
# load the dataset
data(iris)
# prepare training scheme
control <- trainControl(method="repeatedcv", number=10, repeats=3)
# design the parameter tuning grid
grid <- expand.grid(size=c(5,10,20,50), k=c(1,2,3,4,5))
# train the model
model <- train(Species~., data=iris, method="lda", trControl=control, tuneGrid=NULL)
# summarize the model
print(model)
plot(model)
But I get the error
There are no tuning parameters for this model
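caret contains several LDA methods, for example:
method = "lda" involves no tuning parameters.
method = "lda2" allows tuning dimen (the number of discriminant vectors).
If you want to tune parameters (and that can only be the number of discriminant vectors), you must use "lda2". "lda" does not allow tuning, so to run it you can simply drop tuneGrid. Dropping tuneGrid does not switch off cross-validation: train still resamples according to trControl to estimate performance; there is just nothing to tune. Since tuneGrid=NULL is already the default, the error above most likely comes from plot(model), which has no tuning parameter to plot for "lda".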
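To illustrate the "lda2" option mentioned above, here is a small sketch that tunes dimen on iris. The grid values 1:2 are an assumption based on iris having three classes (so at most two discriminant vectors), and the object name model_lda2 is made up for this example.

```r
library(caret)
data(iris)

set.seed(7)
control <- trainControl(method = "repeatedcv", number = 10, repeats = 3)

# lda2 exposes a single tuning parameter, dimen
grid <- expand.grid(dimen = 1:2)

model_lda2 <- train(Species ~ ., data = iris, method = "lda2",
                    trControl = control, tuneGrid = grid)

print(model_lda2)    # resampled performance for each value of dimen
model_lda2$bestTune  # the value of dimen that was selected
```

Because this model does have a tuning parameter, the results table contains one row per candidate value of dimen.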
I'll answer my own question: I think that just deleting tuneGrid=NULL (and the plot(model) call) works fine:
set.seed(7)
# load the library
library(caret)
# load the dataset
data(iris)
# prepare training scheme
control <- trainControl(method="repeatedcv", number=10, repeats=3)
# train the model; lda has no tuning parameters, so no tuneGrid is needed
model <- train(Species~., data=iris, method="lda", trControl=control)
# summarize the model
print(model)
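One way to check whether resampling still happens when no tuneGrid is supplied is to look at the per-fold results that train() stores. This is a minimal sketch on the same iris setup; the object name model_lda is made up for the example.

```r
library(caret)
data(iris)

set.seed(7)
control <- trainControl(method = "repeatedcv", number = 10, repeats = 3)
model_lda <- train(Species ~ ., data = iris, method = "lda", trControl = control)

# One row per held-out fold: 10 folds x 3 repeats
nrow(model_lda$resample)
head(model_lda$resample)

# A single row of averaged performance, since there is nothing to tune
model_lda$results
```

If cross-validation were skipped, model_lda$resample would not contain one accuracy estimate per fold and repeat.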

Fitting models with class probabilities with caret in R?

I'm working on making some predictions with stacked ML algorithms in R, and I have successfully prepared the sub-models (see working code below):
library(caret)
library(caretEnsemble)  # provides caretList and caretStack
trainSet <- read.csv("train.csv")
testSet <- read.csv("test.csv")
trainSet$Survived <- as.factor(trainSet$Survived)
algorithmList <- c('lda', 'rpart', 'glm', 'knn', 'svmRadial')
# create submodels
control <- trainControl(method="repeatedcv", number=10, repeats=3, savePredictions=TRUE, classProbs=TRUE)
set.seed(seed)
models <- caretList(Survived~ Pclass + Sex + Fare, data=trainSet, trControl=control, methodList=algorithmList)
results <- resamples(models)
summary(results)
dotplot(results)
but when I actually go to stack the sub-models:
# stack using glm
stackControl <- trainControl(method="repeatedcv", number=10, repeats=3, savePredictions=TRUE, classProbs=TRUE)
set.seed(seed)
stack.glm <- caretStack(models, method="glm", metric="Accuracy", trControl=stackControl)
print(stack.glm)
It gives me the error message:
Error in check_caretList_model_types(list_of_models) :
The following models were fit by caret::train with no class probabilities: lda, rpart, glm, knn, svmRadial.
Please re-fit them with trainControl(classProbs=TRUE)
But, as you can see, I believe I actually did fit them with classProbs=TRUE (see my 'control' variable) and don't understand why I'm getting this error message! Any ideas?
