Syntax for glmer function for use with glmulti? - r

Using glmer, I can run a logistic regression mixed model just fine. But when I try to do the same using glmulti, I get errors (described below). I think the problem is with the function I am specifying for use in glmulti. I want a function that specifies a logistic regression model for data containing continuous fixed covariates and categorical random effects, using a logit link. The response variable is a binary 0/1.
Sample data:
library(lme4)
library(rJava)
library(glmulti)
set.seed(666)
x1 = rnorm(1000) # some continuous variables
x2 = rnorm(1000)
x3 = rnorm(1000)
r1 = rep(c("red", "blue"), times = 500) #categorical random effects
r2 = rep(c("big", "small"), times = 500)
z = 1 + 2*x1 + 3*x2 +2*x3
pr = 1/(1+exp(-z))
y = rbinom(1000,1,pr) # bernoulli response variable
df = data.frame(y=y,x1=x1,x2=x2, x3=x3, r1=r1, r2=r2)
A single glmer logistic regression works just fine:
model1<-glmer(y~x1+x2+x3+(1|r1)+(1|r2),data=df,family="binomial")
But errors occur when I try to use the same model structure through glmulti:
# create a function - I think this is where my problem is
glmer.glmulti <- function(formula, data, family = binomial(link = "logit"), random = "", ...){
  glmer(paste(deparse(formula), random), data = data, ...)
}
# run glmulti models
glmulti.logregmixed <-
glmulti(formula(glmer(y~x1+x2+x3+(1|r1)+(1|r2), data=df), fixed.only=TRUE), #error w/o fixed.only=TRUE
data=df,
level = 2,
method = "g",
crit = "aicc",
confsetsize = 128,
plotty = F, report = F,
fitfunc = glmer.glmulti,
family = binomial(link ="logit"),
random="+(1|r1)","+(1|r2)", # possibly this line is incorrect?
intercept=TRUE)
#Errors returned:
singular fit
Error in glmulti(formula(glmer(y ~ x1 + x2 + x3 + (1 | r1) + (1 | r2), :
Improper call of glmulti.
In addition: Warning message:
In glmer(y ~ x1 + x2 + x3 + (1 | r1) + (1 | r2), data = df) :
calling glmer() with family=gaussian (identity link) as a shortcut to lmer() is deprecated; please call lmer() directly
I've tried various changes to the function and to the formula and fitfunc arguments of the glmulti call. I've also tried substituting lmer for glmer, but I don't understand the error, and I'm worried that calling lmer would change the model structure: during one of my attempts, summary() of the model reported "Linear mixed model fit by REML ['lmerMod']". I need the glmulti models to match what I obtain with model1 using glmer (i.e., summary(model1) reports "Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']").
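From the glmulti wrapper examples I've seen, I suspect the fix is to pass all the random terms as a single string and forward the family to glmer inside the wrapper, something like the untested sketch below, though I'm not certain this is right:
# Untested sketch: build the full formula as a string so glmer() sees the
# random effects, and pass the random part as one argument to glmulti().
glmer.glmulti <- function(formula, data, random = "", ...){
  glmer(as.formula(paste(deparse(formula), random)),
        data = data, family = binomial(link = "logit"), ...)
}
glmulti(y ~ x1 + x2 + x3, data = df,
        level = 1,                      # main effects only, to keep the sketch small
        crit = "aicc", confsetsize = 128,
        plotty = FALSE, report = FALSE,
        fitfunc = glmer.glmulti,
        random = "+(1|r1)+(1|r2)")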
Many similar questions remain unanswered. Thanks in advance!
Credit:
sample data set created with help from here:
https://stats.stackexchange.com/questions/46523/how-to-simulate-artificial-data-for-logistic-regression
glmulti code adapted from here:
Model selection using glmulti

Related

How to calculate McFadden's Pseudo-R^2 from a pooled (after multiple imputation) weighted quasibinomial logistic model in R?

I am trying to fit a quasibinomial logistic model in R (8 predictor variables). I used multiple imputation with the mice package, 5 iterations. I also have analytic weights that I am incorporating into the regression model. I want to calculate McFadden's pseudo-R^2 in these circumstances, but I'm not sure how to go about it, since the model is both pooled and weighted.
Here is the code for the imputation.
imp <- mice(data, maxit = 5, predictorMatrix = predM, method = meth, print = TRUE)
data_imputed <- complete(imp, action="long", include = TRUE)
data_imputed <- as.mids(data_imputed)
Here is the code for my model. I have two continuous predictors, four binary predictors, and two categorical predictors. The analytic weights are non-integers and are provided in the dataset I'm working with.
model <- with(data_imputed, glm(Y ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8, weights = wt, family = quasibinomial(link = "logit")))
I tried using pR2() from the pscl package, without success.
The mice package also has a pool.r.squared() function, but it only works for linear regressions and doesn't give any pseudo-R^2.
As I am not very familiar with quasibinomial logistic regression (I am fitting one because, to my understanding, standard logistic regression can't handle non-integer weights), perhaps McFadden's pseudo-R^2 isn't even appropriate to compute here. Any enlightenment is appreciated.
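For reference, my understanding is that McFadden's pseudo-R^2 is 1 minus the ratio of the fitted model's log-likelihood to that of the intercept-only model. Below is a rough, untested sketch of what I could compute on a single completed dataset with an ordinary binomial fit, but it sidesteps both the pooling and the quasibinomial (no likelihood) issues rather than solving them:
# Rough sketch on one completed dataset only; not a pooled or quasibinomial answer.
d1 <- complete(imp, 1)   # first imputed dataset
full <- glm(Y ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8,
            weights = wt, family = binomial, data = d1)
null <- glm(Y ~ 1, weights = wt, family = binomial, data = d1)
1 - as.numeric(logLik(full)) / as.numeric(logLik(null))  # McFadden's pseudo-R^2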

Probing interactions in nlme using the "interactions" package in R

I am running a linear mixed effects model using the "nlme" package, looking at stress and lifestyle as predictors of change in cognition over 4 years in a longitudinal dataset. All variables in the model are continuous.
I am able to create the model and get the summary statistics using this code:
mod1 <- lme(MS ~ age + sex + edu + GDST1*Time + HLI*Time + GDST1*HLI*Time, random= ~ 1|ID, data=NuAge_long, na.action=na.omit)
summary(mod1)
I am trying to use the "interactions" package to probe the 3-way interaction:
sim_slopes(model = mod1, pred = Time, modx = GDST1, mod2 = HLI, data = NuAge_long)
but am receiving this error:
Error in if (tcol == "df") tcol <- "t val." : argument is of length zero
I am also trying to plot the interaction using the same "interactions" package:
interact_plot(model = mod1, pred = Time, modx = GDST1, mod2 = HLI, data = NuAge_long)
and am receiving this error:
Error in UseMethod("family") : no applicable method for 'family' applied to an object of class "lme"
I can't seem to find what these errors mean and why I'm getting them. Any help would be appreciated!
From ?interactions::sim_slopes:
The function is tested with ‘lm’, ‘glm’,
‘svyglm’, ‘merMod’, ‘rq’, ‘brmsfit’, ‘stanreg’ models. Models
from other classes may work as well but are not officially
supported. The model should include the interaction of
interest.
Note this does not include lme models. On the other hand, merMod models are those generated by lme4::[g]lmer(), and as far as I can tell you should be able to fit this model equally well with lmer():
library(lme4)
mod1 <- lmer(MS ~ age + sex + edu + GDST1*Time + HLI*Time + GDST1*HLI*Time
+ (1|ID), data=NuAge_long)
(things will get harder if you want to specify correlation structures, e.g. correlation = corAR1(), which works for lme() but not lmer() ...)
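With that lmer() refit, the same interactions calls should work unchanged (as far as I can tell), since merMod objects are among the supported classes:
library(interactions)
sim_slopes(model = mod1, pred = Time, modx = GDST1, mod2 = HLI, data = NuAge_long)
interact_plot(model = mod1, pred = Time, modx = GDST1, mod2 = HLI, data = NuAge_long)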

Estimating risk ratio instead of odds ratio in mixed effect logistic regression in `R`

glmer is used to estimate effects on the logit scale of y when the data are clustered. In the following model
fit1 = glmer(y ~ treat + x + ( 1 | cluster), family = binomial(link = "logit"))
the exp of the coefficient of treat is the odds ratio for a binary 0-1 treatment variable, x is a covariate, and cluster is a clustering indicator across which we model a random effect (intercept). A standard approach in GLMs to estimate risk ratios is to use a log link instead, i.e. family = binomial(link = "log"). However, using this in glmer I get the error
Error in (function (fr, X, reTrms, family, nAGQ = 1L, verbose = 0L, maxit = 100L, :
(maxstephalfit) PIRLS step-halvings failed to reduce deviance in pwrssUpdate
after calling
fit1 = glmer(y ~ treat + x + ( 1 | cluster), family = binomial(link = "log"))
A web search revealed other people had similar issues with the Gamma family.
This seems to be a general problem as the reproducible example below demonstrates. My question thus is: how can I estimate risk ratios using a mixed effect model like glmer?
Reproducible Example
The following code simulates data that replicates the problem.
n = 1000 # sample size
m = 50 # number of clusters
J = sample(1:m, n, replace = T) # simulate cluster membership
x = rnorm(n) # simulate covariate
treat = rbinom(n, 1, 0.5) # simulate random treatment
u = rnorm(m) # simulate random intercepts
lt = x + treat + u[J] # compute linear term of logistic mixed effect model
p = 1/(1+exp(-lt)) # use logit link to transform to probabilities
y = rbinom(n,1,p) # draw binomial outcomes
d = data.frame(y, x, treat)
# First fit logistic model with glmer
fit1 = glmer( y ~ treat + x + (1 | as.factor(J)),
family = binomial(link = "logit"), data = d)
summary(fit1)
# Now try to log link
fit2 = glmer( y ~ treat + x + (1 | as.factor(J)),
family = binomial(link = "log"), data = d)
This error is returned due to your model producing values > 1:
PIRLS step-halvings failed to reduce deviance in pwrssUpdate
...
When using lme4 to fit GLMMs with link functions that do not automatically constrain the response to the allowable range of the distributional family (e.g. binomial models with a log link, where the estimated probability can be > 1, or Gamma models with an inverse link, where the estimated mean can be negative), it is not unusual to get this error. It occurs because lme4 does nothing to constrain the predicted values, so NaN values pop up and are not handled gracefully. If possible, switch to a link function that constrains the response (e.g. a logit link for binomial or a log link for Gamma).
Unfortunately, the suggested workaround is to use a different link function.
The following paper surveys a number of alternative model choices for calculating [adjusted] relative risks:
Model choices to obtain adjusted risk difference estimates from a binomial regression model with convergence problems: An assessment of methods of adjusted risk difference estimation (2016)
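For completeness, one alternative worth sketching (not covered above, so treat it as a hedged suggestion): a Poisson GLMM with a log link fit to the binary outcome, often called modified Poisson regression, gives exponentiated coefficients that can be interpreted as adjusted risk ratios; its naive standard errors tend to be conservative, so robust or bootstrap standard errors are usually paired with it. Using the simulated data d from the question:
# Hedged sketch: modified Poisson (log link) with a random intercept.
fit3 = glmer( y ~ treat + x + (1 | as.factor(J)),
              family = poisson(link = "log"), data = d)
exp(fixef(fit3))  # exponentiated coefficients ~ adjusted risk ratios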

How to Fit Conway–Maxwell-Poisson regression in R?

I want to fit a Conway–Maxwell–Poisson regression with one response and two randomly generated covariates in R. How do I fit it?
library(COMPoissonReg)
n1=200
x1 = rnorm(n1,0,1)
x2 = rnorm(n1,0,1)
b0=0.05; b1=0.0025;b2=0.005;b7=0.0001
y=b0+(b1*x1)+(b2*x2)
y1=exp(y)
nu=exp(y)
y2=rcmp(n1, y1,nu)
model = glm.cmp(y1 ~ x1+x2,formula.nu=x1+x2)
I get the following error:
Error in formula.default(object, env = baseenv()) : invalid formula
Please guide me on how to fit this model.
For your data above, you did not simulate the overdispersion parameter, so you don't need to specify a formula for nu. The call below is the right form, but your dependent variable is not an integer, so it throws an error:
glm.cmp(y1 ~ x1+x2)
You can simulate data like this instead; the version below includes overdispersion, so you can see it in the estimate for nu:
n1=200
x1 = rnorm(n1,0,1)
x2 = rnorm(n1,0,1)
b0=0.05; b1=3;b2=2
mu=exp(b0+(b1*x1)+(b2*x2))
y = rnbinom(length(mu),mu=mu,size=1)
df = data.frame(y, x1, x2)
glm.cmp(y ~ x1+x2, data=df)
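As an aside, the "invalid formula" error in the original call most likely comes from formula.nu = x1+x2 not being a formula object; if you do want to model the dispersion, it presumably needs a tilde. A rough, untested sketch with integer CMP outcomes simulated via rcmp():
# Untested sketch: simulate integer-valued CMP responses and model both
# the rate (lambda) and the dispersion (nu).
library(COMPoissonReg)
set.seed(1)
n1 = 200
x1 = rnorm(n1,0,1)
x2 = rnorm(n1,0,1)
lambda = exp(0.05 + 0.5*x1 + 0.3*x2)  # rate-like parameter
nu = exp(0.2 + 0.1*x1)                # dispersion varies with x1
y = rcmp(n1, lambda, nu)              # integer CMP draws
glm.cmp(y ~ x1 + x2, formula.nu = ~ x1)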

Getting estimated means after multiple imputation using the mitml, nlme & geepack R packages

I'm running multilevel multiple imputation through the mitml package (using the panImpute() function) and am fitting linear mixed models and marginal models through the nlme and geepack packages and mitml's with() function.
I can get the estimates, p-values, etc. for those through the testEstimates() function, but I'm also looking to get estimated means across my model predictors. I've tried the emmeans package, which I normally use for getting estimated means when running nlme & geepack without multiple imputation, but emmeans tells me "Can't handle an object of class “mitml.result”".
I'm wondering is there a way to get pooled estimated means from the multiple imputation analyses I've run?
The data frames I'm analyzing are longitudinal/repeated measures and in long format. In the linear mixed model I want to get the estimated means for a 2x2 interaction effect, and in the marginal model I'm trying to get estimated means for the 6 levels of the 'time' variable. The outcome in all models is continuous.
Here's my code
# mixed model
fml <- Dep + time ~ 1 + (1|id)
imp <- panImpute(data=Data, formula=fml, n.burn=50000, n.iter=5000, m=100, group = "treatment")
summary(imp)
plot(imp, trace="all")
implist <- mitmlComplete(imp, "all", force.list = TRUE)
fit <- with(implist, lme(Dep ~ time*treatment, random = ~ 1|id, method = "ML", na.action = na.exclude, control = list(opt = "optim")))
testEstimates(fit, var.comp = TRUE)
confint.mitml.testEstimates(testEstimates(fit, var.comp = TRUE))
# marginal model
fml <- Dep + time ~ 1 + (1|id)
imp <- panImpute(data=Data, formula=fml, n.burn=50000, n.iter=5000, m=100)
summary(imp)
plot(imp, trace="all")
implist <- mitmlComplete(imp, "all", force.list = TRUE)
fit <- with(implist, geeglm(Dep ~ time, id = id, corstr ="unstructured"))
testEstimates(fit, var.comp = TRUE)
confint.mitml.testEstimates(testEstimates(fit, var.comp = TRUE))
is there a way to get pooled estimated means from the multiple imputation analyses I've run?
This is not a reprex without Data, so I can't verify this works for you. But emmeans provides support for mira-class (lists of) models from the mice package. So if you fit your model inside with() using a mids object rather than a mitml.list object, you can use that to obtain marginal means of your outcome (and any contrasts or pairwise comparisons afterward).
Using example data found here, which uncomfortably loads an external workspace:
con <- url("https://www.gerkovink.com/mimp/popular.RData")
load(con)
## imputation
library(mice)
ini <- mice(popNCR, maxit = 0)
meth <- ini$meth
meth[c(3, 5, 6, 7)] <- "norm"
pred <- ini$pred
pred[, "pupil"] <- 0
imp <- mice(popNCR, meth = meth, pred = pred, print = FALSE)
## analysis
library(lme4) # fit multilevel model
mod <- with(imp, lmer(popular ~ sex + (1|class)))
library(emmeans) # obtain pooled estimates of means
(em <- emmeans(mod, specs = ~ sex))
pairs(em) # test comparison
