Plot ROC in R for a GLM model

I fitted a glm model and am trying to create pred1 so that I can plot the ROC curve, but this call fails:
pred1 <- prediction(predictions=glm.prob2,labels = train_data)
Error in prediction(predictions = glm.prob2, labels = train_data) :
Number of predictions in each run must be equal to the number of labels for each run.
Here is the full code:
View(train_data)
library(ROCR)
pred1 <- prediction(predictions=glm.prob2,labels = train_data)
# ROCR names these measures "tpr" (true positive rate) and "fpr" (false positive rate)
perf1 <- performance(pred1, measure = "tpr", x.measure = "fpr")
plot(perf1)
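The error message suggests that labels received the whole data frame rather than one label per prediction. A minimal sketch of the usual pattern follows, assuming the response is a single 0/1 column. It uses the built-in mtcars data with am as a stand-in response, because the asker's column names are not shown; fit, probs, pred, perf and auc are illustrative names.

```r
library(ROCR)

# Stand-in data: mtcars with the binary column `am` as the response
# (an assumption; substitute your own data frame and response column).
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial)

# Predicted probabilities: one value per row of the data
probs <- predict(fit, type = "response")

# labels must be a vector of the same length as the predictions,
# i.e. the response column, not the whole data frame
pred <- prediction(predictions = probs, labels = mtcars$am)

# ROC curve: true positive rate against false positive rate
perf <- performance(pred, measure = "tpr", x.measure = "fpr")
plot(perf)

# Area under the curve
auc <- performance(pred, measure = "auc")@y.values[[1]]
```

In the question's code this would correspond to passing something like train_data$response as labels, where response is a placeholder for whatever the outcome column is actually called.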

Related

Getting Confidence Intervals from predicted values from a nlme model from package medrc

I am trying to get confidence intervals for predicted values from a model fitted with medrc (an nlme model). The same code worked with a model from the regular drc package, which does not use random effects, so I assume I am doing something wrong with the nlme model, because I get errors when I ask for confidence intervals.
Below is an example data frame of the data I am using:
df <- data.frame(Geno = c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,6,6,6,6,7,7,7,7,8,8,8,8,
9,9,9,9,10,10,10,10,11,11,11,11,12,12,12,12,13,13,13,13,14,14,14,14),
Treatment = c(3,6,9,"MMM",3,6,9,"MMM",3,6,9,"MMM",3,6,9,"MMM",3,6,9,"MMM",3,6,9,"MMM",
3,6,9,"MMM",3,6,9,"MMM",3,6,9,"MMM",3,6,9,"MMM",3,6,9,"MMM",3,6,9,"MMM",
3,6,9,"MMM",3,6,9,"MMM"),
Temp = c(32.741,34.628,37.924,28.535,32.741,34.628,37.924,28.535,32.741,34.628,37.924,28.535,
32.741,34.628,37.924,28.535,32.741,34.628,37.924,28.535,32.741,34.628,37.924,28.535,
32.741,34.628,37.924,28.535,32.741,34.628,37.924,28.535,32.741,34.628,37.924,28.535,
32.741,34.628,37.924,28.535,32.741,34.628,37.924,28.535,32.741,34.628,37.924,28.535,
32.741,34.628,37.924,28.535,32.741,34.628,37.924,28.535),
PAM = c(0.62225,0.593,0.35775,0.654,0.60625,0.5846667,0.316,0.60875,0.62275,0.60875,0.32125,
0.63725,0.60275,0.588,0.32275,0.60875,0.65225,0.6185,0.29925,0.64525,0.61925,0.61775,
0.11725,0.596,0.603,0.6065,0.2545,0.59025,0.586,0.5895,0.27025,0.59125,0.6345,0.6135,
0.3755,0.622,0.53375,0.552,0.2485,0.51925,0.6375,0.6256667,0.3575,0.63975,0.59375,0.6055,
0.333,0.64125,0.55275,0.51025,0.319,0.55725,0.6375,0.64725,0.348,0.66125))
df$Geno <- as.factor(df$Geno)
With this data, I am fitting a dose-response model with 3 parameters: b = slope, d = max, e = ED50.
model <- medrm(PAM ~ Temp,
data=df,
random= d + e ~ 1|Geno,
fct=LL.3(),
control=nlmeControl(msMaxIter = 2000, maxIter=2000, minScale=0.00001, tolerance=0.1, pnlsTol=1))
summary(model)
plot(model)
From this model I want to predict values at different temperatures along the fitted curve:
model_preddata = data.frame(Temp = seq(28,39, length.out = 100))
model_pred = as.data.frame(predict(model, newdata = model_preddata, interval = 'confidence'))
This gives an error, but I can get it to predict the PAM values if I add level = 0:
model_pred = as.data.frame(predict(model, newdata = model_preddata, interval = 'confidence', level = 0))
However, this does not give me the lower and upper bound columns that I get when I run this code with models that have no random effects.
Can anyone help me figure out how to get confidence intervals for the predicted values of this model?
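One possible workaround, sketched here under stated assumptions, is to build an approximate population-level band by simulating the fixed effects from a multivariate normal distribution and pushing each draw through the LL.3 curve. This is a swapped-in technique, not something medrc provides, and it ignores the random-effect variance. The helper names ll3 and ll3_band are hypothetical, and the values of est and vc are placeholders rather than estimates from the model above.

```r
library(MASS)  # for mvrnorm()

# Three-parameter log-logistic curve, in the parameterisation of drc::LL.3():
# b = slope, d = upper limit, e = ED50, lower limit fixed at 0
ll3 <- function(x, b, d, e) d / (1 + exp(b * (log(x) - log(e))))

# Approximate population-level band: draw fixed effects from N(est, vc),
# evaluate the curve for every draw, take pointwise quantiles.
# Draws with e <= 0 would give NaN, so this assumes e is well away from 0.
ll3_band <- function(x, est, vc, n_sim = 2000, level = 0.95) {
  draws <- mvrnorm(n_sim, mu = est, Sigma = vc)
  sims <- apply(draws, 1, function(p) ll3(x, p[["b"]], p[["d"]], p[["e"]]))
  alpha <- (1 - level) / 2
  data.frame(
    x     = x,
    fit   = ll3(x, est[["b"]], est[["d"]], est[["e"]]),
    lower = apply(sims, 1, quantile, probs = alpha),
    upper = apply(sims, 1, quantile, probs = 1 - alpha)
  )
}

# PLACEHOLDER values for illustration only (not results from the question's model)
est <- c(b = 20, d = 0.62, e = 37)
vc  <- diag(c(4, 0.0004, 0.04))

set.seed(1)
band <- ll3_band(seq(28, 39, length.out = 100), est, vc)
head(band)
```

For an nlme fit, fixef() and vcov() return the fixed-effect estimates and their covariance matrix. Whether the medrm object exposes the underlying nlme fit in a way that those functions accept is an assumption to check against the medrc documentation.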

How to plot the roc of a glm model with multiple terms in R?

I have a glm model with multiple terms. I need to plot the ROC curve and find the AUC. I tried using roc() and multiclass.roc() but get Error in plot.new() : figure margins too large.
library(AER)
library(pROC)  # provides roc() and multiclass.roc()
data("Affairs")
str(Affairs)
Affairs$affairs <- as.factor(Affairs$affairs)
m3 <- glm(affairs ~ gender + age + yearsmarried + religiousness + rating,
          family = binomial, data = Affairs)
honk <- roc(affairs ~ gender+age+yearsmarried+religiousness+rating, data = Affairs)
plot(honk)
honk$auc
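A common pattern for a multi-term model is to give roc() the observed binary outcome and the model's predicted probabilities, rather than the model formula. The sketch below assumes that the outcome of interest is "any affair versus none"; the had_affair column is introduced here for illustration and is not in the original data. The "figure margins too large" error often just means the plotting window is too small, so enlarging the plot pane may also be needed.

```r
library(AER)
library(pROC)

data("Affairs")

# Assumed binary outcome: 1 if any affairs were reported, 0 otherwise
Affairs$had_affair <- as.integer(Affairs$affairs > 0)

m <- glm(had_affair ~ gender + age + yearsmarried + religiousness + rating,
         family = binomial, data = Affairs)

# One predicted probability per observation
p <- predict(m, type = "response")

# ROC curve of the fitted probabilities against the observed outcome;
# levels and direction are set explicitly so that 0 is control and 1 is case
r <- roc(response = Affairs$had_affair, predictor = p,
         levels = c(0, 1), direction = "<")
plot(r)
auc(r)
```

This gives a single curve for the whole model, whereas passing several predictors in the roc() formula does not evaluate the fitted glm at all.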

plotting semivariograms with non-nlme package models

I am trying to plot a semivariogram of the residuals of a generalised mixed effect model in R. For a mixed effect model with a normal distribution this is straightforward with the nlme package. Here is an example using the quakes dataset:
library(nlme)
data(quakes)
head(quakes)
model1 <- lme(mag ~ depth , random = ~1|stations, data = quakes)
summary(model1)
semivario <- Variogram(model1, form = ~long+lat,resType = "normalized")
plot(semivario, smooth = TRUE)
I want to fit a model with a non-normal distribution, which I can't do with lme, so I have tried glmer and glmmPQL. I turned 'mag' into a binary variable and then tried to apply the Variogram function to these models.
quakes$thresh <- ifelse(quakes$mag > 5, 0, 1)
library(MASS)
model2 <- glmmPQL(as.factor(thresh) ~ depth , random = ~1|stations, family = binomial, data = quakes)
summary(model2)
semivario <- Variogram(model2, form = ~long+lat,resType = "normalized")
plot(semivario, smooth = TRUE)
library(lme4)
model3 <-glmer(as.factor(thresh) ~ depth + (1|stations), data = quakes, family = binomial)
summary(model3)
semivario <- Variogram(model3, form = ~long+lat,resType = "normalized")
plot(semivario, smooth = TRUE)
Neither of these works for plotting the variogram. The glmmPQL version says that lat and long are not found, and the glmer version says that distance is not specified.
How can I plot a semivariogram for these models? Is the Variogram function from the nlme package unusable for them? If so, what alternatives can I use?
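One alternative that does not depend on Variogram() is to compute an empirical semivariogram directly from the model residuals and the coordinates. The sketch below is one way to do that in base R, not an nlme or lme4 feature. The helper name semivariogram is hypothetical, distances are plain Euclidean distances in degrees of long/lat, and depth is scaled inside the formula as an assumption to help the glmer fit.

```r
library(lme4)

data(quakes)
quakes$thresh <- ifelse(quakes$mag > 5, 0, 1)

# Binomial mixed model; depth is scaled to keep the optimiser well behaved
model3 <- glmer(thresh ~ scale(depth) + (1 | stations),
                data = quakes, family = binomial)

# Empirical semivariogram: half the squared residual difference for every
# pair of points, averaged within distance bins
semivariogram <- function(res, coords, n_bins = 15) {
  d <- as.vector(dist(coords))      # pairwise distances between locations
  g <- as.vector(dist(res))^2 / 2   # semivariance for the same pairs
  bins <- cut(d, breaks = n_bins)
  data.frame(
    dist    = as.vector(tapply(d, bins, mean)),
    gamma   = as.vector(tapply(g, bins, mean)),
    n_pairs = as.vector(table(bins))
  )
}

sv <- semivariogram(residuals(model3, type = "pearson"),
                    quakes[, c("long", "lat")])
plot(sv$dist, sv$gamma, xlab = "Distance", ylab = "Semivariance")
```

The distance error from the glmer model is probably because Variogram() has no method for glmer fits, so the call falls through to the default method, which expects a distance argument.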

Error when calculating prediction error for logistic regression model

I am getting the error "$ operator is invalid for atomic vectors" when trying to calculate the prediction error for a logistic regression model.
Here is the code and data I am using:
install.packages("ElemStatLearn")
library(ElemStatLearn)
# training data
train = vowel.train
# keeping only the first three columns (note: [1:3] selects columns, not classes)
train.new = train[1:3]
# test data
test = vowel.test
test.new = test[1:3]
# performing the logistic regression
train.new$y <- as.factor(train.new$y)
mylogit <- glm(y ~ ., data = train.new, family = "binomial")
train.logit.values <- predict(mylogit, newdata=test.new, type = "response")
# this is where the error occurs (below)
train.logit.values$se.fit
I tried converting it to a list, but that did not work. Is there a quick fix so that I can obtain either the prediction error or the misclassification rate?
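With type = "response" alone, predict() on a glm returns a plain numeric vector, which is why $se.fit fails. Standard errors are only returned when se.fit = TRUE is passed. The sketch below shows both the standard errors and a misclassification rate. It uses two species from the built-in iris data as a stand-in for a two-class problem; the train/test split and the 0.5 cut-off are assumptions made for illustration.

```r
# Stand-in two-class data: drop one iris species so the response is binary
two <- droplevels(subset(iris, Species != "setosa"))

set.seed(1)
idx   <- sample(nrow(two), 70)
train <- two[idx, ]
test  <- two[-idx, ]

fit <- glm(Species ~ Sepal.Length + Sepal.Width, data = train, family = binomial)

# se.fit = TRUE makes predict() return a list with $fit and $se.fit
pr <- predict(fit, newdata = test, type = "response", se.fit = TRUE)
head(pr$fit)     # predicted probabilities of the second factor level
head(pr$se.fit)  # their standard errors

# Classify with a 0.5 cut-off and compare with the observed classes
lev <- levels(train$Species)
pred_class <- ifelse(pr$fit > 0.5, lev[2], lev[1])
misclass <- mean(pred_class != test$Species)
misclass
```

A binomial glm models the first factor level against all the others, so a response with more than two classes needs to be reduced to two before the misclassification rate is meaningful.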

Predict linearRidge with dummy variable

I am trying to fit a ridge regression with the code below, using the GenCont data from the ridge package:
library(ridge)
data(GenCont)
GenCont_df <- as.data.frame(GenCont)
GenCont_df$SNP1 <- as.factor(GenCont_df$SNP1)
mod2 <- linearRidge(Phenotypes ~ SNP1+SNP2, data = GenCont_df)
predict(mod2, GenCont_df, na.action = na.pass, all.coef = FALSE,scaling ="scale")
But when I use dummy (factor) variables in the model, I get this error:
Error in X[, ll] : subscript out of bounds
Is there a way to predict from a ridge regression with dummy variables in R?
