I have a glm model with several predictors. I need to plot its ROC curve and find the AUC. I tried roc() and multiclass.roc(), but I get Error in plot.new() : figure margins too large.
library(AER)
library(pROC)  # provides roc() and multiclass.roc()
data("Affairs")
str(Affairs)
Affairs$affairs <- as.factor(Affairs$affairs)
m3 <- glm(affairs ~ gender + age + yearsmarried + religiousness + rating,
          family = binomial, data = Affairs)
honk <- roc(affairs ~ gender+age+yearsmarried+religiousness+rating, data = Affairs)
plot(honk)
honk$auc
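One way to get a single ROC curve for a multi-term glm is to score each observation with the model's fitted probability and pass that one score to roc(). The sketch below is only illustrative: it uses simulated data rather than Affairs, assumes the pROC package is installed, and the names sim, fit, prob and roc_obj are made up for the example.

```r
library(pROC)

set.seed(1)
n <- 500
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
# Simulated binary 0/1 outcome; a ROC curve needs exactly two classes
sim$y <- rbinom(n, 1, plogis(-0.5 + 1.2 * sim$x1 - 0.8 * sim$x2))

fit <- glm(y ~ x1 + x2, family = binomial, data = sim)

# One score per observation: the fitted probability from the whole model
prob <- predict(fit, type = "response")
roc_obj <- roc(sim$y, prob, levels = c(0, 1), direction = "<", quiet = TRUE)

plot(roc_obj)
auc_value <- as.numeric(auc(roc_obj))
auc_value
```

With the Affairs data, affairs takes more than two distinct values, so it would first have to be collapsed to a binary outcome (for example, any affair versus none) before this approach applies.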
I'm working on optimizing a logistic regression model built with glm; the optimization is a lasso regression using glmnet. I want to compare both models using the output of a Hosmer-Lemeshow test, and this is what I get.
For the glm I get
> hl <- hoslem.test(trainingDatos$Exited, fitted(logit.Mod))
> hl
Hosmer and Lemeshow goodness of fit (GOF) test
data: trainingDatos$Exited, fitted(logit.Mod)
X-squared = 2.9161, df = 8, p-value = 0.9395
And when I try to run the test for the lasso regression I get
> hll <- hoslem.test(trainingDatos$Exited, fitted(lasso.model), g=10)
Error in cut.default(yhat, breaks = qq, include.lowest = TRUE) :
'x' must be numeric
I also tried to use the coefficients of the lasso regression to make it numeric and I get
> hll <- hoslem.test(trainingDatos$Exited, fitted(lasso.model$beta), g=10)
Error: $ operator not defined for this S4 class
But when I treat it as an S4 object
> hll <- hoslem.test(trainingDatos$Exited, fitted(lasso.model@beta), g=10)
Error in fitted(lasso.model@beta) :
trying to get slot "beta" from an object (class "lognet") that is not an S4 object
Any way to run the test for my lasso regression?
Here is my full code for the lasso regression. Sorry, I can't share the data right now.
#Creation of Training Data Set
input_ones <- Datos[which(Datos$Exited == 1), ] #All 1s
input_zeros <- Datos[which(Datos$Exited == 0), ] #All 0s
set.seed(100)
#Training 1s
input_ones_training_rows <- sample(1:nrow(input_ones), 0.7*nrow(input_ones))
#Training 0s
input_zeros_training_rows <- sample(1:nrow(input_zeros), 0.7*nrow(input_ones))
training_ones <- input_ones[input_ones_training_rows, ]
training_zeros <- input_zeros[input_zeros_training_rows, ]
trainingDatos <- rbind(training_ones, training_zeros)
library(glmnet)
#Conversion of training data into matrix form
x <- model.matrix(Exited ~ CreditScore + Geography + Gender
+ Age + Tenure + Balance + IsActiveMember
+ EstimatedSalary, trainingDatos)[,-1]
#Defining numeric response variable
y <- trainingDatos$Exited
set.seed(100)
#Grid search to find best lambda
cv.lasso<-cv.glmnet(x, y, alpha = 1, family = "binomial")
#Creation of the model
lasso.model <- glmnet(x, y, alpha = 1, family = "binomial",
lambda = cv.lasso$lambda.1se)
coef(cv.lasso, cv.lasso$lambda.1se)
#Now trying to run the test
library(ResourceSelection)
set.seed(12657)
hll <- hoslem.test(trainingDatos$Exited, fitted(lasso.model), g=10)       # 'x' must be numeric
hll <- hoslem.test(trainingDatos$Exited, fitted(lasso.model$beta), g=10)  # $ not defined for S4
hll <- hoslem.test(trainingDatos$Exited, fitted(lasso.model@beta), g=10)  # says lasso.model is not an S4 object
glmnet has its own predict() method for obtaining fitted values. As rightly mentioned, the errors came from using fitted(). Meanwhile, running such tests could be easier with the gofcat package, where supported model objects are passed directly to the functions. Your glm model, for instance, would go in as hosmerlem(logit.Mod).
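To stay with hoslem.test() from ResourceSelection, the fitted probabilities can be taken from predict() instead of fitted(). This is a minimal sketch on simulated data, not the asker's Datos; x_sim, y_sim, lasso_fit and p_hat are hypothetical names, and both glmnet and ResourceSelection are assumed to be installed.

```r
library(glmnet)
library(ResourceSelection)

set.seed(100)
n <- 1000
x_sim <- matrix(rnorm(n * 5), ncol = 5)
colnames(x_sim) <- paste0("v", 1:5)
# Simulated 0/1 response driven by the first two columns
y_sim <- rbinom(n, 1, plogis(1.5 * x_sim[, 1] - 1 * x_sim[, 2]))

cv_fit <- cv.glmnet(x_sim, y_sim, alpha = 1, family = "binomial")
lasso_fit <- glmnet(x_sim, y_sim, alpha = 1, family = "binomial",
                    lambda = cv_fit$lambda.1se)

# predict() needs the design matrix via newx; type = "response" returns
# probabilities as an n x 1 matrix, flattened here to a numeric vector
p_hat <- as.numeric(predict(lasso_fit, newx = x_sim, type = "response"))

# hoslem.test() takes numeric 0/1 observations and numeric fitted probabilities
hl_lasso <- hoslem.test(y_sim, p_hat, g = 10)
hl_lasso
```

If trainingDatos$Exited is stored as a factor, it would need converting to numeric 0/1 first, for example with as.numeric(as.character(trainingDatos$Exited)).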
I am new to R and I am trying to create a logit model. I created a train and test set for my data and when I am trying to create a logit model, I keep getting the following error message:
model <- glm(mortDefault2001$default ~.,family=binomial(link='logit'),data=train)
Error in model.frame.default(formula = mortDefault2001$default ~ ., data = train, :
variable lengths differ (found for 'creditScore')
What am I doing wrong/what can I do to fix this to run the model?
This is the code I used to create the test and train sets:
data <- subset(mortDefault2001,select=c(1,2,3,4,6))
train <- data[1:80000,]
test <- data[80001:99999,]
model <- glm(mortDefault2001$default ~.,family=binomial(link='logit'),data=train)
Error in model.frame.default(formula = mortDefault2001$default ~ ., data = train, :
variable lengths differ (found for 'creditScore')
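"variable lengths differ" is what model.frame() reports when the variables in a formula do not all have the same number of rows. In the call above, the response mortDefault2001$default is taken from the full data set while the predictors come from the 80,000-row train. A possible fix, assuming the default column is among those kept by subset(), is to refer to the response by name so that it is looked up in train. The sketch below uses a small simulated data frame (loans, a hypothetical stand-in for mortDefault2001) to show the pattern.

```r
set.seed(42)
# Hypothetical stand-in for mortDefault2001
loans <- data.frame(
  creditScore = round(rnorm(1000, 700, 50)),
  yearsEmploy = rpois(1000, 5),
  ccDebt      = round(runif(1000, 0, 15000))
)
loans$default <- rbinom(1000, 1, plogis(-2 + 0.0002 * loans$ccDebt))

train <- loans[1:800, ]
test  <- loans[801:1000, ]

# `default` is looked up inside `train`, so its length matches the predictors
model <- glm(default ~ ., family = binomial(link = "logit"), data = train)

# Predicted probabilities for the held-out rows
test_prob <- predict(model, newdata = test, type = "response")
```

Keeping the whole formula free of `$` also means predict() can find every variable in whatever data frame is passed as newdata.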
pred1 <- prediction(predictions=glm.prob2,labels = train_data)
Error in prediction(predictions = glm.prob2, labels = train_data) :
Number of predictions in each run must be equal to the number of labels for each run.
I used a glm model to predict the output, and I am trying to create pred1 so that I can plot the ROC curve.
Here is the full code
View(train_data)
library(ROCR)
pred1 <- prediction(predictions=glm.prob2,labels = train_data)
perf1 <- performance(pred1, measure = "tpr", x.measure = "fpr")  # ROCR's measure names are "tpr" and "fpr"
plot(perf1)
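ROCR's prediction() expects labels to hold one true class per prediction, so passing a whole data frame makes the counts disagree. A minimal sketch of the usual pattern follows, on simulated data; train_sim, fit and prob are hypothetical names, and the label column is assumed to be the same response the model was fitted on. The ROC axes use ROCR's measure names "tpr" and "fpr".

```r
library(ROCR)

set.seed(7)
n <- 400
train_sim <- data.frame(x = rnorm(n))
train_sim$y <- rbinom(n, 1, plogis(1.5 * train_sim$x))

fit <- glm(y ~ x, family = binomial, data = train_sim)
prob <- predict(fit, type = "response")

# labels is a single vector of true classes, same length as the predictions
pred <- prediction(predictions = prob, labels = train_sim$y)

perf <- performance(pred, measure = "tpr", x.measure = "fpr")
plot(perf)

# Area under the ROC curve, stored in the y.values slot
auc_value <- performance(pred, measure = "auc")@y.values[[1]]
auc_value
```

Checking length(prob) against length of the label vector before calling prediction() is a quick way to catch this mismatch.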
I am getting the following error: $ operator is invalid for atomic vectors. It occurs when I try to calculate the prediction error for a logistic regression model.
Here is the code and data I am using:
install.packages("ElemStatLearn")
library(ElemStatLearn)
# training data
train = vowel.train
# keep only the first three columns (the response y and two predictors)
train.new = train[1:3]
# test data
test = vowel.test
test.new = test[1:3]
# performing the logistic regression
train.new$y <- as.factor(train.new$y)
mylogit <- glm(y ~ ., data = train.new, family = "binomial")
train.logit.values <- predict(mylogit, newdata=test.new, type = "response")
# this is where the error occurs (below)
train.logit.values$se.fit
I tried converting it to a list, but that did not work. Is there a quick fix so that I can obtain either the prediction error or the misclassification rate?
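By default, predict() on a glm returns a plain numeric vector, which is why `$` fails on it; it returns a list with an se.fit element only when se.fit = TRUE is requested. The sketch below shows both forms and one way to compute a misclassification rate. It uses simulated two-class data rather than the vowel data, and d, train_d, test_d and the 0.5 cutoff are assumptions made for illustration.

```r
set.seed(3)
n <- 300
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- factor(rbinom(n, 1, plogis(d$x1 - d$x2)))
train_d <- d[1:200, ]
test_d  <- d[201:300, ]

fit <- glm(y ~ ., data = train_d, family = "binomial")

# Default: a plain numeric vector of probabilities, so `$` does not apply
p <- predict(fit, newdata = test_d, type = "response")

# With se.fit = TRUE the result is a list containing $fit and $se.fit
p_se <- predict(fit, newdata = test_d, type = "response", se.fit = TRUE)

# Classify at 0.5 and compare with the observed classes
pred_class <- ifelse(p > 0.5, "1", "0")
misclass_rate <- mean(pred_class != as.character(test_d$y))
misclass_rate
```

Note that se.fit is the standard error of each predicted value, not an error rate against the observed classes. Also, if y has more than two levels, a binomial glm treats the first level as failure and all other levels as success, so the response would need to be reduced to two classes for this rate to be meaningful.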