Using logit.reg function for logistic regression - r

After loading and partitioning my data successfully, I am trying to get the following code to run:
logit.reg <- glm(Financial.Condition ~ ., data = train.df, family = "binomial")
options(scipen = 999)
summary(logit.reg)
But I am getting the following error message:
> logit.reg <- glm(Financial.Condition ~ ., data = train.df, family = "binomial")
Error in model.matrix.default(mt, mf, contrasts) :
variable 1 has no levels
> options(scipen = 999)
> summary(logit.reg)
Error in summary(logit.reg) : object 'logit.reg' not found
What is the proper way to write the logit.reg function?
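One thing that may be worth checking (an assumption, since train.df is not shown): the message suggests that the first model variable, presumably the response Financial.Condition, is a factor with no levels. That can happen when the column is entirely NA, or when missing values elsewhere leave no complete rows. A minimal diagnostic sketch on toy data; the values and the column x1 are made up for illustration:

```r
# Toy stand-in for train.df (hypothetical values, not the asker's data).
train.df <- data.frame(
  Financial.Condition = factor(c(NA, NA, NA)),  # all missing: factor has no levels
  x1 = c(0.1, 0.2, 0.3)
)

# Per-column diagnostics: number of missing values and of factor levels.
n_missing <- vapply(train.df, function(col) sum(is.na(col)), integer(1))
n_levels <- vapply(
  train.df,
  function(col) if (is.factor(col)) nlevels(col) else NA_integer_,
  integer(1)
)
column_report <- data.frame(n_missing, n_levels)

# Rows left once incomplete rows are dropped (the default na.action).
n_complete <- sum(complete.cases(train.df))

print(column_report)
print(n_complete)
```

If a factor column reports zero levels, or no complete rows remain, the data preparation step is the place to look rather than the glm() call itself.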

Related

error in plot function (variable match was not found)

I am trying to plot my decision tree, but it keeps giving me this error, and I cannot figure out what is wrong.
Here's the code:
install.packages("foreign")
library("foreign")
install.packages("C50")
library(C50)
setwd("C:\\Users\\hp\\Music")  # setwd() takes a directory, not a file path
df <- read.spss("C:\\Users\\hp\\Music\\SefSec_2014_HH_weight new.sav", to.data.frame = TRUE)
View(df)
summary(df)
#summary(df$C12_21)
controlFactor = C5.0Control(sample=0.7,seed=10,CF=0.25,winnow = TRUE ,minCases = 20)
model = C5.0(df[,c(427,432,437,422,447,452,457,462,467,472,482,487,477,492,497,502,503,504,505,506,507,508)], df[,512], control = controlFactor)
na.omit(df)
summary(model)
plot(model,type="s",main="Descision Tree")
And here's the error:
> plot(model,type="s",main="Descision Tree")
Error in FUN(X[[i]], ...) : Variable match was not found.

Retrieving Variable Importance from Caret trained model with "lda2", "qda", "lda"

I can get variable importance out from "nnet" and "knn" models, but not from "lda", "lda2", and "qda".
I am using varImp(). I've tried everything I can think of and just can't get a proper idea of what the variable importance is.
Here is my code for training the model:
lda_model <- train(quality2 ~ .,
data = train_data,
method = "lda",
preProcess = c("center", "scale"),
trControl = trainControl(method = "repeatedcv",
number = 10,
repeats = 2),
importance = TRUE)
and here is the error I get when I try to check importance:
> varImp(lda_model)
Error in model.frame.default(formula = y ~ x, na.action = na.omit, drop.unused.levels = TRUE) :
invalid type (list) for variable 'y'
In addition: Warning messages:
1: In mean.default(y, rm.na = TRUE) :
argument is not numeric or logical: returning NA
2: In Ops.factor(left, right) : ‘-’ not meaningful for factors
I know this means it's treating it as an object class list instead of a trained model, and I've tried it on lda_model$finalmodel and others, but it's still not working.
How can I get proper feedback when using lda/qda on how my model is performing and which variables are performing best?
I had the same problem, and it seems to come from the way the dataset is imported into R. I first imported it with the {readxl} package and varImp() didn't work. Then I tried importing through the clipboard, and now varImp() works on my lda model built with {caret}.
My code with {readxl}:
library(readxl)
glauc <- read_excel("Glaucome.xlsx", sheet="GlaucomaM")
rownames(glauc) <- glauc$IDENT
glauc$IDENT <- NULL
glauc$Class <- as.factor(glauc$Class)
library(caret)
numappr <- createDataPartition(glauc$Class, p=0.7)
appr <- glauc[numappr$Resample1,]
test <- glauc[-numappr$Resample1,]
Ctrl <- trainControl(summaryFunction=twoClassSummary,
classProbs=TRUE)
appr.lda <- train(Class~., data=appr, method="lda",
trControl=Ctrl, preProc = c("center","scale"),
metric="ROC")
varImp(appr.lda)
This leads to an error message like yours:
Error: $ operator is invalid for atomic vectors
In addition: Warning messages:
1: In mean.default(y, rm.na = TRUE) :
argument is not numeric or logical: returning NA
2: In Ops.factor(left, right) : ‘-’ not meaningful for factors
And my code with read.table() and the clipboard:
glauc <- read.table("clipboard", header=T, sep="\t", dec=".")
rownames(glauc) <- glauc$IDENT
glauc$IDENT <- NULL
library(caret)
numappr <- createDataPartition(glauc$Class, p=0.7)
appr <- glauc[numappr$Resample1,]
test <- glauc[-numappr$Resample1,]
Ctrl <- trainControl(summaryFunction=twoClassSummary,
classProbs=TRUE)
appr.lda <- train(Class~., data=appr, method="lda",
trControl=Ctrl, preProc = c("center","scale"),
metric="ROC")
varImp(appr.lda)
This one leads to the result (only the first ones here):
varImp(appr.lda)
ROC curve variable importance
only 20 most important variables shown (out of 62)
Importance
vari 100.00
varg 97.14
vars 94.52
phci 93.69
hic 92.02
phcg 90.55
tms 89.96
Hope it helps.
Sophie
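A likely explanation, offered as an assumption rather than something confirmed in this thread: read_excel() returns a tibble, while read.table() returns a plain data.frame, and some caret helpers expect the latter. Converting with as.data.frame() right after import may be enough. The sketch below builds a stand-in object by hand (toy values, tibble classes set manually) so it runs without the Excel file:

```r
# Stand-in for what readxl::read_excel() returns: a data frame carrying the
# extra tibble classes. Values are toy data, not the Glaucoma dataset.
glauc <- structure(
  data.frame(
    IDENT = c("a", "b", "c"),
    vari  = c(0.1, 0.5, 0.9),
    Class = c("normal", "glaucoma", "normal"),
    stringsAsFactors = FALSE
  ),
  class = c("tbl_df", "tbl", "data.frame")
)

# Drop the tibble classes before any further preparation.
glauc <- as.data.frame(glauc)

# Same preparation steps as in the answer above.
rownames(glauc) <- glauc$IDENT
glauc$IDENT <- NULL
glauc$Class <- as.factor(glauc$Class)

print(class(glauc))
```

With the real file, the equivalent would be glauc <- as.data.frame(read_excel("Glaucome.xlsx", sheet="GlaucomaM")) followed by the same steps.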

kernlab class prediction calculations failed => with SVMLinear NOT SVMRadial

I have a problem training svmLinear with caret. The data works just fine with svmRadial, though.
The data is accessible via (29/05/2016):
https://www.dropbox.com/s/ia2vc25uhxdgqn1/projetTest01.txt?dl=0
(8000 lines of 1021 variables, ~10% target)
Here's the code:
projetTest01<-read.table("projetTest01.txt", sep="\t")
Test01<-list(data=projetTest01[,-c(2,3)],label=projetTest01[,3])
Test01N<-Test01
Test01N$label<-as.factor(Test01$label)
levels(Test01N$label)[levels(Test01N$label)=="0"] <- "No"
levels(Test01N$label)[levels(Test01N$label)=="1"] <- "Yes"
temp<-as.matrix(Test01$data)
storage.mode(temp) <- "numeric" #I need 'num' type
Test01N$data<-as.data.frame(temp)
svmTuneGrid_L <- data.frame(.C = 2^(-2:7))
trControl_SVML<-trainControl(method = "repeatedcv", repeats = 3, classProbs = TRUE)
svmFit_Lin <- train(Test01N$label ~ ., data = Test01N$data,method = "svmLinear",preProc = c("center", "scale"), tuneGrid = svmTuneGrid_L,trControl = trControl_SVML)
And I got these messages:
line search fails [..]
Warning in method$predict(modelFit = modelFit, newdata = newdata, submodels = param) :
kernlab class prediction calculations failed; returning NAs
Warning in data.frame(..., check.names = FALSE) :
row names were found from a short variable and have been discarded
I searched this site and the web for answers, but none of the usual causes seem to apply:
- the levels aren't numeric (they are Yes/No)
- classProbs is set to TRUE
- the labels can't be predicted perfectly from another variable (I know this from other algorithms)
- there isn't an empty class
- using preProc (center and scale) or not doesn't make a difference
And the data works just fine with svmRadial!
I use caret 6.0-68.
I am really at a loss. Does anyone have an idea?

Error in numInClass[i]%/%k : non-numeric argument to binary operator

I cannot manage to train a model with the caret package without errors.
Here is a minimal reproducible data set: Dataset
And here is the code I'm using...
data <- read.csv('file.csv', header = TRUE, sep = ',')
library(caret)
data <- data[complete.cases(data), ]
data$donation <- as.factor(data$donation)
set.seed(1337)
ctrl <- trainControl(
method="repeatedcv",
number="10",
repeats=10)
data <- data[,-2]
nnet.fit <- train(donation ~ ., data = data,
method = "nnet",
trControl = ctrl,
verbose = FALSE,
tuneLength = 5)
...that leads to this error:
Error in numInClass[i]%/%k : non-numeric argument to binary operator
I have checked this answer but didn't understand how to apply it to my problem.
Any clue?
I just saw this while looking for a solution to the same problem. The answer is very simple: number must be numeric, so the 10 should not be wrapped in quotation marks.
ctrl <- trainControl(
method="repeatedcv",
number=10, # numeric, not the string "10"
repeats=10)
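Why the quotes matter: the error message shows an integer division by k (numInClass[i] %/% k), and that k presumably comes from number. Integer division on a character string fails in base R. A quick check, independent of caret:

```r
k_string <- "10"   # what number = "10" passes along
k_number <- 10     # what number = 10 passes along

# Integer division works on numbers...
folds_ok <- 95 %/% k_number

# ...but raises an error on a character string; keep the message for inspection.
result <- tryCatch(
  95 %/% k_string,
  error = function(e) conditionMessage(e)
)

print(folds_ok)
print(result)
```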

Using your own model in train (caret package)?

I am trying to use train() from caret with a package that is not included in it, and I get an error I cannot figure out. Any idea? I used the following link to get started.
bmsMeth<-list(type="Regression",library="BMS",loop=NULL,prob=NULL)
prm<-data.frame(parameter="mprior.size",class="numeric",label="mprior.size")
bmsMeth$parameters<-prm
bmsGrid<-function(x,y,len=NULL){
out<-expand.grid(mprior.size=seq(2,3,by=len))
out
}
bmsMeth$grid<-bmsGrid
bmsFit<-function(x,y,param, lev=NULL) {bms(cbind(y,x),burn=5000,iter=100000,nmodel=1000,mcmc="bd",g="UIP",mprior.size=param$mprior.size)}
bmsMeth$fit<-bmsFit
bmsPred<-function(modelFit,newdata,preProcess=NULL,submodels=NULL){predict(modelFit,newdata)}
bmsMeth$predict<-bmsPred
library(caret)
data.train<-data.frame(runif(100),runif(100),runif(100),runif(100),runif(100))#synthetic data for testing
bms(cbind(data.train[,1],data.train[,-1]),burn=5000,iter=100000,nmodel=1000,mcmc="bd",g="UIP",mprior.size=2)#function out of caret is working
preProcess=c('center','scale')
myTimeControl <- trainControl(method = "timeslice",initialWindow = 0.99*nrow(data.train), horizon = 1, fixedWindow = FALSE)
tune <- train(data.train[,-1],data.train[,1],preProcess=preProcess,method = bmsMeth,tuneLength=2,metric= "RMSE",trControl =myTimeControl,type="Regression")
The error I get:
Error in train.default(data.train[, -1], data.train[, 1], preProcess = preProcess, :
Stopping
In addition: Warning messages:
1: In eval(expr, envir, enclos) :
model fit failed for Training1: mprior.size=2 Error in method$fit(x = x, y = y, wts = wts, param = tuneValue, lev = obsLevels, :
unused arguments (wts = wts, last = last, classProbs = classProbs, type = "Regression")
2: In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
There were missing values in resampled performance measures.
Apparently, I just had to declare the arguments in the function, even if I never use them:
bmsFit<-function(x,y,param, lev=NULL, last, weights, classProbs, ...) {bms(data.frame(y,x),burn=5000,iter=100000,nmodel=1000,mcmc="bd",g="UIP",mprior.size=param$mprior.size)}
Your function bms() does not seem to exist ...
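On why declaring the unused arguments fixes the error: train() calls the fit function with a fixed set of named arguments (the error message lists wts, last and classProbs among them), so a function that neither names them nor has ... rejects the call. A base-R sketch of the mechanism; call_like_caret() is a hypothetical stand-in for the caller, not caret code:

```r
# Fit function without the extra arguments (like the original bmsFit).
fit_strict <- function(x, y, param, lev = NULL) "fitted"

# Fit function that absorbs anything extra via `...`.
fit_tolerant <- function(x, y, param, lev = NULL, ...) "fitted"

# Hypothetical stand-in for the caller: passes more named arguments than
# fit_strict declares.
call_like_caret <- function(fit) {
  fit(x = 1, y = 2, wts = NULL, param = list(), lev = NULL,
      last = FALSE, classProbs = FALSE)
}

# The strict version fails with "unused arguments"; the tolerant one runs.
strict_result <- tryCatch(
  call_like_caret(fit_strict),
  error = function(e) "error: unused arguments"
)
tolerant_result <- call_like_caret(fit_tolerant)

print(strict_result)
print(tolerant_result)
```

Ending the signature with ... is the simplest way to stay compatible if the caller passes further arguments, such as type = "Regression" in the train() call above.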
