Only calling a Binomial model - r

My understanding is that there are issues with just calling a binomial model in R without specifying the link function (logit/probit). Is that right?
model <- glm(y ~ x, family = binomial, data = data)
vs
model_2 <- glm(y ~ x, binomial(link = "logit"), data = data)
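For reference, binomial() falls back to the logit link when no link is given, so the two calls above should fit the same model. A minimal sketch on simulated data (the data frame `sim` and its columns are made up for illustration and are not from the question):

```r
# Simulated binary outcome; `sim` is a hypothetical stand-in for `data`.
set.seed(42)
sim <- data.frame(x = rnorm(200))
sim$y <- rbinom(200, size = 1, prob = plogis(0.5 + sim$x))

# The link used when none is specified.
default_link <- binomial()$link

# Same model written three ways; only the probit fit should differ.
m_default <- glm(y ~ x, family = binomial, data = sim)
m_logit   <- glm(y ~ x, family = binomial(link = "logit"), data = sim)
m_probit  <- glm(y ~ x, family = binomial(link = "probit"), data = sim)

same_fit <- isTRUE(all.equal(coef(m_default), coef(m_logit)))
print(default_link)
print(same_fit)
```

The link only needs to be spelled out when something other than logit is wanted, for example probit.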

Extract the AIC value from a list of survival models

I have the following list of models:
list_model <- list(fit1, fit2, fit.coxph1, fit.coxph2)
fit1 and fit2 were created as follows:
surv_object <- Surv(time = data$`Last observation (days)`, event = data$Death)
surv_object
fit1 <- survfit(surv_object ~ Treatment, data = data)
summary(fit1)
fit2 <- survfit(surv_object ~ Treatment + `Age (years)_cat`, data = data)
summary(fit2)
While the last ones are Cox models:
fit.coxph1 <- coxph(surv_object ~ Treatment,
                    data = data)
fit.coxph1
ggforest(fit.coxph1, data = data)
fit.coxph2 <- coxph(surv_object ~ Treatment + `Age (years)_cat`,
                    data = data)
fit.coxph2
Does anyone know how to iteratively extract the AIC values from them?
Thanks.
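One possible approach, sketched under assumptions: AIC() needs a log-likelihood, which coxph fits provide, while survfit (Kaplan-Meier) objects, as far as I know, do not. The sketch below therefore filters the list down to the Cox models before looping. It uses the `lung` data shipped with the survival package rather than the data from the question, and the model names are illustrative.

```r
library(survival)

# Stand-in models built on survival::lung (not the question's data).
cox1 <- coxph(Surv(time, status) ~ sex, data = lung)
cox2 <- coxph(Surv(time, status) ~ sex + age, data = lung)
km1  <- survfit(Surv(time, status) ~ sex, data = lung)

models <- list(cox1 = cox1, cox2 = cox2, km1 = km1)

# Keep only the objects that inherit from "coxph".
is_cox <- vapply(models, inherits, logical(1), what = "coxph")

# One AIC value per Cox model, returned as a named numeric vector.
aic_values <- vapply(models[is_cox], AIC, numeric(1))
print(aic_values)
```

Naming the list elements keeps the result readable, since vapply carries the names over to the output.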

GLM Family using tidymodels

I am trying to use the tidymodels package for a GLM and want to use the Gamma or Poisson distribution.
Using glm I would use something like the following
# using glm
mdl <- glm(data = data, y ~ x, family = Gamma(link = "inverse"))
mdl <- glm(data = data, y ~ x, family = poisson(link = "log"))
# using glmnet
library(glmnet)
mdl <- glmnet(data$x, data$y, family = Gamma(link = "inverse"))
mdl <- glmnet(data$x, data$y, family = poisson(link = "log"))
How can I achieve the same using tidymodels? Note that I am trying to do a regression and not a classification (logistic regression) for which I could use parsnip::logistic_reg().
I found one article on Generalized Linear Models on tidymodels, which belongs to the embed package but does not show how to specify the family.
I would expect something similar to the following (which does not work, because linear_reg has neither a family nor a link parameter, and set_engine does not support glm in linear regression mode):
mdl <- linear_reg(mode = "regression", family = "gamma", link = "inverse") %>% set_engine("glm") # or glmnet
That was easier than expected:
mdl <- linear_reg(mode = "regression") %>%
  set_engine("glmnet", family = "gamma")
# or
mdl <- linear_reg(mode = "regression") %>%
  set_engine("glmnet", family = Gamma(link = "inverse"))
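As far as I can tell, newer parsnip releases also register "glm" as an engine for linear_reg(), with the family passed through set_engine() in the same way. A minimal sketch, assuming an installed parsnip version that includes this engine; the data frame `sim` is simulated for illustration and is not from the question:

```r
library(parsnip)

# Simulated Gamma response whose mean is 1 / (0.5 + 0.2 * x),
# i.e. linear on the inverse-link scale.
set.seed(1)
sim <- data.frame(x = runif(200, 1, 3))
sim$y <- rgamma(200, shape = 5, rate = 5 * (0.5 + 0.2 * sim$x))

# Assumption: the "glm" engine is available for linear_reg() in this parsnip version.
spec <- linear_reg(mode = "regression") %>%
  set_engine("glm", family = Gamma(link = "inverse"))

gamma_fit <- fit(spec, y ~ x, data = sim)

# The underlying stats::glm object is stored in the `fit` element.
print(family(gamma_fit$fit))
```

If the engine is not available in the installed version, set_engine() should raise an error, in which case the glmnet route above remains the fallback.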

Fit a no intercept binary model in caret [duplicate]

I am performing logistic regression in R using the caret package and trying to force a zero intercept so that the probability at x=0 is .5. In other forms of regression, it seems you can turn the intercept off using tuneGrid, but that has no such functionality for logistic regression. Any ideas?
model <- train(y ~ 0+ x, data = data, method = "glm", family = binomial(link="probit"),
               trControl = train.control)
And yes, I "know" that the probability at x=0 should be .5, and thus trying to force it.
There's a vignette on how to set up a custom model for caret. In the solution below, you can also see why the intercept persists:
library(caret)
glm_wo_intercept = getModelInfo("glm",regex=FALSE)[[1]]
If you look at the fit function, there's a line that does:
glm_wo_intercept$fit
....
modelArgs <- c(list(formula = as.formula(".outcome ~ ."), data = dat), theDots)
...
So the intercept is there by default. You can change this line and run caret on this modified model:
glm_wo_intercept$fit = function(x, y, wts, param, lev, last, classProbs, ...) {
  dat <- if(is.data.frame(x)) x else as.data.frame(x)
  dat$.outcome <- y
  if(length(levels(y)) > 2) stop("glm models can only use 2-class outcomes")
  theDots <- list(...)
  if(!any(names(theDots) == "family"))
  {
    theDots$family <- if(is.factor(y)) binomial() else gaussian()
  }
  if(!is.null(wts)) theDots$weights <- wts
  # change the model here
  modelArgs <- c(list(formula = as.formula(".outcome ~ 0+."), data = dat), theDots)
  out <- do.call("glm", modelArgs)
  out$call <- NULL
  out
}
We fit the model:
data = data.frame(y=factor(runif(100)>0.5),x=rnorm(100))
model <- train(y ~ 0+ x, data = data, method = glm_wo_intercept,
               family = binomial(),trControl = trainControl(method = "cv",number=3))
predict(model,data.frame(x=0),type="prob")
FALSE TRUE
1 0.5 0.5
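The 0.5 result follows from the link function: without an intercept the linear predictor at x = 0 is exactly 0, and both the logit and probit inverse links map 0 to 0.5. A minimal sketch using plain glm() outside caret, on simulated data (the names `sim`, `fit_logit` and `fit_probit` are illustrative):

```r
# Simulated data with no true intercept.
set.seed(1)
sim <- data.frame(x = rnorm(100))
sim$y <- rbinom(100, size = 1, prob = pnorm(1.2 * sim$x))

# No-intercept fits with both links.
fit_logit  <- glm(y ~ 0 + x, family = binomial(link = "logit"),  data = sim)
fit_probit <- glm(y ~ 0 + x, family = binomial(link = "probit"), data = sim)

# Predicted probability at x = 0 for each fit.
at_zero  <- data.frame(x = 0)
p_logit  <- unname(predict(fit_logit,  newdata = at_zero, type = "response"))
p_probit <- unname(predict(fit_probit, newdata = at_zero, type = "response"))
print(c(logit = p_logit, probit = p_probit))
```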

Plotting a GLM Model

I am having trouble plotting the predictions of a glm model. When I run the code below, R draws an empty plot.
Logistic regression
model2 = glm(as.factor(loan_status) ~ . , data = train , family = binomial(link = 'logit'))
summary(model2)
Prediction
pred1 <- predict(model2,test,type = 'response')
ggplot(data.frame(pred1), aes(pred1))
The train dataset consists of categorical and numerical values.
Call:
glm(formula = as.factor(loan_status) ~ ., family = binomial(link = "logit"),
data = train)
Appreciate any assistance
Thank you
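A likely cause is that the ggplot() call above sets up the data and aesthetics but adds no geom layer, so there is nothing to draw. A minimal sketch of one way to show the predicted probabilities, assuming a histogram is an acceptable display; it uses the built-in mtcars data in place of the loan data, and the object names are illustrative:

```r
library(ggplot2)

# Stand-in logistic model on mtcars (not the loan data from the question).
fit <- glm(am ~ wt, data = mtcars, family = binomial(link = "logit"))
pred <- predict(fit, newdata = mtcars, type = "response")

# ggplot() alone only defines the canvas; the geom layer draws the data.
p <- ggplot(data.frame(pred = pred), aes(x = pred)) +
  geom_histogram(bins = 10) +
  labs(x = "Predicted probability", y = "Count")
print(p)
```

Other geoms, such as geom_density(), would work the same way once added to the ggplot() call.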

Run HLM mediation in R

I am trying to run an HLM mediation analysis with the "mediation" package:
med.fit <- glmer(M ~ treat + control + (1|subject_id) ,family = binomial(link = "logit"), data = R1_data)
out.fit <- glmer(Y ~ M+ treat + control+ (1 + M|subject_id),family = binomial(link = "logit"), data = R1_data)
med.out <- mediate(med.fit, out.fit, treat = "treat", mediator = "M", sims = 1000)
I got this error message:
Error in `[.data.frame`(y.data, int.term.name[p]) : undefined columns selected
How can I solve this problem?
Here is the original data and code:
names(R1_data)
[1] "subject_id"
[3] "Presented_is_solvable"
[5] "JOS"
[17] "Answer_JOS"
[23] "Matrix_Z_score"
library(mediation)
library(lme4)
med.fit <- glmer(JOS ~ Matrix_Z_score + Presented_is_solvable + (1|subject_id) ,family = binomial(link = "logit"), data = R1_data)
out.fit <- glmer(Answer_JOS ~ JOS + Matrix_Z_score +Presented_is_solvable + (1 + JOS|subject_id),family = binomial(link = "logit"), data = R1_data)
med.out <- mediate(med.fit, out.fit, treat = "Matrix_Z_score", mediator = "JOS", sims = 1000)
I figured out that this happens when the treatment or mediator variable is stored as a factor in R. The mediate function cannot locate those variables in the fitted models, because the model terms are named as the variable name followed by the factor level.
The solution is to make sure those variables are stored as integers. For comparison, you can look at how the variables are typed in the student data set within the mediation package.
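The conversion can be sketched as follows, assuming the factor levels are the strings "0" and "1". The data frame `toy` and the helper `factor_to_int` are hypothetical and only illustrate the idea; they are not part of the mediation package.

```r
# Hypothetical data with treatment and mediator stored as factors.
toy <- data.frame(
  treat = factor(c("0", "1", "1", "0")),
  M     = factor(c("1", "0", "1", "1"))
)

# Go through as.character() first: as.integer() applied directly to a
# factor returns the internal level codes (1, 2), not the labels (0, 1).
factor_to_int <- function(f) as.integer(as.character(f))

toy$treat <- factor_to_int(toy$treat)
toy$M     <- factor_to_int(toy$M)
str(toy)
```

After the conversion, the models would need to be refitted so that the coefficient names match the plain variable names passed to mediate().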
