I'm fitting a negative binomial regression. I scaled all continuous predictors prior to fitting the model. I need to transform the coefficients of scaled predictors to be able to interpret them on their original scale. Example:
# example dataset
set.seed(1)
dep <- dnbinom(seq(1:150), size = 150, prob = 0.75)
ind.1 <- ifelse(sign(rnorm(150))==-1,0,1)
ind.2 <- rnorm(150, 10, 1.7)
df <- data.frame(dep, ind.1, ind.2)
# scale continuous independent variable
df$ind.2 <- scale(df$ind.2)
# fit model
m1 <- MASS::glm.nb(dep ~ ind.1 + ind.2, data = df)
summz <- summary(m1)
To get the result for ind.1 I take the exponential of the coefficient:
# result for ind.1
exp(summz$coefficients["ind.1","Estimate"])
> [1] 1.276929
Which shows that for every 1-unit increase in ind.1 you'd expect dep to change by a factor of 1.276929. But what about ind.2? I gather that, since the predictor is scaled, the coefficient can be interpreted as the effect of a 1-standard-deviation increase in ind.2 on dep. How do I transform this back to the original units? This answer says to multiply the coefficient by the sd of the predictor, but how do I do this in the case of a logit link? exp(summz$coefficients["ind.2","Estimate"] * sc) doesn't seem to make sense.
Set up data:
set.seed(1)
dep <- dnbinom(seq(1:150), size = 150, prob = 0.75)
ind.1 <- ifelse(sign(rnorm(150))==-1,0,1)
ind.2 <- rnorm(150, 10, 1.7)
df <- data.frame(dep, ind.1, ind.2)
sc <- sd(df$ind.2)
Fit unscaled and scaled models:
m_unsc <- MASS::glm.nb(dep ~ ind.1 + ind.2, data = df)
m_sc <- update(m_unsc, data = transform(df, ind.2 = drop(scale(df$ind.2))))
Compare coefficients:
cbind(coef(m_unsc), coef(m_sc))
                   [,1]        [,2]
(Intercept) -5.50449624 -5.13543854
ind.1        0.24445805  0.24445805
ind.2        0.03662308  0.06366992
Check equivalence: dividing the scaled coefficient by the scaling factor (sc = sd(ind.2)) recovers the unscaled coefficient:
all.equal(coef(m_sc)["ind.2"]/sc, coef(m_unsc)["ind.2"])
The negative binomial model uses a log link, not a logit link, so if you want to back-transform the coefficient to get proportional or "fold" changes per unit of ind.2:
exp(coef(m_sc)["ind.2"]/sc)
This gives 1.0373, i.e. approximately a 4% increase in the response per unit change in ind.2 (you can confirm that it is the same as exponentiating the unscaled coefficient).
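For example, a quick check using the objects defined above confirms that the two routes agree:
exp(coef(m_unsc)["ind.2"])     ## exponentiate the unscaled coefficient directly
exp(coef(m_sc)["ind.2"]/sc)    ## divide the scaled coefficient by sc, then exponentiate
## both give approximately 1.0373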
Note that 2/3 of the answers in the linked question, including the currently accepted answer, are wrong: you should be dividing the scaled coefficient by the scaling factor, not multiplying it.
Does anyone know how to extract the parameter estimates of a random term when using the (1 | …) syntax in a glmer model (including the se, t ratio and p value)? I'm only able to access the variance and std. deviation with the summary function.
Some background: I used cohort and period random terms (both factorized), where period = each survey year and cohort = 8 birth cohorts. My empty model looks like this:
glmer(pid ~ age + age2 + (1 | cohort) + (1 | period))
There's a bit of a conceptual problem with what you are doing. The random effects do not have the same standing in statistical theory as the fixed effects: you are not really supposed to be making inferences on their estimates, since you don't have a random sample from their overall population, and hence you need to make some untested assumptions about their distribution. That said, there are apparently times when you might want to do it, taking care that you are not making unsupportable claims. See: https://stats.stackexchange.com/questions/392314/interpretation-of-fixed-effect-coefficients-from-glms-and-glmms
Dimitris Rizopoulos then responded to a request for the possibility of getting "an average" of the random effects conditional on the fixed effects (rather than the flipped version of mixed-model inference). He offered a function in his GLMMadaptive package:
https://drizopoulos.github.io/GLMMadaptive/articles/Methods_MixMod.html#marginalized-coefficients
This is his example:
install.packages("GLMMadaptive"); library(GLMMadaptive)
set.seed(1234)
n <- 100 # number of subjects
K <- 8 # number of measurements per subject
t_max <- 15 # maximum follow-up time
# we construct a data frame with the design:
# everyone has a baseline measurement, and then measurements at random follow-up times
DF <- data.frame(id = rep(seq_len(n), each = K),
                 time = c(replicate(n, c(0, sort(runif(K - 1, 0, t_max))))),
                 sex = rep(gl(2, n/2, labels = c("male", "female")), each = K))
# design matrices for the fixed and random effects
X <- model.matrix(~ sex * time, data = DF)
Z <- model.matrix(~ time, data = DF)
betas <- c(-2.13, -0.25, 0.24, -0.05) # fixed effects coefficients
D11 <- 0.48 # variance of random intercepts
D22 <- 0.1 # variance of random slopes
# we simulate random effects
b <- cbind(rnorm(n, sd = sqrt(D11)), rnorm(n, sd = sqrt(D22)))
# linear predictor
eta_y <- as.vector(X %*% betas + rowSums(Z * b[DF$id, ]))
# we simulate binary longitudinal data
DF$y <- rbinom(n * K, 1, plogis(eta_y))
# We continue by fitting the mixed-effects logistic regression for y, assuming random intercepts and random slopes in the random-effects part.
fm <- mixed_model(fixed = y ~ sex * time, random = ~ time | id, data = DF,
family = binomial())
... and then the call to his marginal_coefs() function:
marginal_coefs(fm, std_errors=TRUE)
               Estimate Std.Err z-value   p-value
(Intercept)     -1.6025  0.2906 -5.5154   < 1e-04
sexfemale       -1.0975  0.3676 -2.9859 0.0028277
time             0.1766  0.0337  5.2346   < 1e-04
sexfemale:time   0.0508  0.0366  1.3864 0.1656167
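As an aside on the original question: if what you want is simply the conditional modes (BLUPs) of the random effects from an lme4 fit, together with their conditional standard deviations (lme4 does not provide t ratios or p-values for them, in line with the caveats above), a minimal sketch, assuming your fitted glmer object is called mod, would be:
library(lme4)
## conditional modes of the random effects and their conditional SDs
re <- ranef(mod, condVar = TRUE)
head(as.data.frame(re))   ## columns grpvar, term, grp, condval, condsd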
library(lme4)
fm1 <- lmer(Reaction ~ Days + (Days|Subject), data = sleepstudy)
To generate a 95% CI, I can use the predictInterval() function from the package merTools.
library(merTools)
head(predictInterval(fm1, level = 0.95, seed = 123, n.sims = 100))
# fit upr lwr
# 1 255.4179 313.8781 184.1400
# 2 273.2944 333.2005 231.3584
# 3 291.8451 342.8701 240.8226
# 4 311.3562 359.2908 250.4980
# 5 330.3671 384.2520 270.7094
# 6 353.4378 409.9307 289.4760
The documentation says this about the predictInterval() function:
This function provides a way to capture model uncertainty in predictions from multi-level models
fit with lme4. By drawing a sampling distribution for the random and the fixed effects and then
estimating the fitted value across that distribution, it is possible to generate a prediction interval for
fitted values that includes all variation in the model except for variation in the covariance parameters,
theta. This is a much faster alternative than bootstrapping for models fit to medium to large datasets.
My goal is to get all the fitted values instead of the upper and lower CI, i.e. for each row I need the original n simulations from which these 95% CIs are calculated. I checked the arguments in the documentation and followed this:
head(predictInterval(fm1, n.sims = 100, returnSims = TRUE, seed = 123, level = 0.95))
# fit upr lwr
# 1 255.4179 313.8781 184.1400
# 2 273.2944 333.2005 231.3584
# 3 291.8451 342.8701 240.8226
# 4 311.3562 359.2908 250.4980
# 5 330.3671 384.2520 270.7094
# 6 353.4378 409.9307 289.4760
Instead of getting the 100 simulations, it still gives me the same output. What am I doing wrong here?
A second question, though I believe this is more of a StatsExchange one. The documentation says:
"By drawing a sampling distribution for the random and the fixed effects and then ..."
Could someone explain to me how it draws that sampling distribution?
You can get simulated values if you specify newdata in the predictInterval() function.
predInt <- predictInterval(fm1, newdata = sleepstudy, n.sims = 100,
returnSims = TRUE, seed = 123, level = 0.95)
simValues <- attr(predInt, "sim.results")
Details on how the sampling distributions of the parameters are created are given in the Details section of the help page. You can get the estimates of the fit and the lower and upper boundaries as:
fit <- apply(simValues, 1, quantile, probs = 0.500)
lwr <- apply(simValues, 1, quantile, probs = 0.025)
upr <- apply(simValues, 1, quantile, probs = 0.975)
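As a quick check using the objects created above, these recomputed quantiles should line up with the interval that predictInterval() returned (its fit column is the median of the simulations by default), and simValues has one row per observation and one column per simulation:
head(cbind(fit = fit, lwr = lwr, upr = upr))
head(predInt[, c("fit", "lwr", "upr")])
dim(simValues)   ## number of observations x n.sims (here 180 x 100 for sleepstudy)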
I am struggling to transform the log odds ratio profile confidence intervals obtained from a logit model into probabilities. I would like to know how to calculate the confidence intervals of the difference between two groups.
If the p-value is > 0.05, the 95% CI of the difference should span from below zero to above zero. However, I don't see how negative values can be obtained when the log ratios have to be exponentiated. Therefore I tried to calculate the CI of one of the groups (B) and see how the lower and upper ends of that CI differ from the estimate of group A. I believe this is not the correct way to calculate the CI of the difference, because the estimate of A is also uncertain.
I would be happy if anyone could help me out.
library(lme4)
# Example data:
set.seed(11)
treatment = c(rep("A",30), rep("B", 40))
site = rep(1:14, each = 5)
presence = c(rbinom(30, 1, 0.6),rbinom(40, 1, 0.8))
df = data.frame(presence, treatment, site)
# Likelihood ratio test
M0 = glmer(presence ~ 1 + (1|site), family = "binomial", data = df)
M1 = glmer(presence ~ treatment + (1|site), family = "binomial", data = df)
anova(M1, M0)
# Calculating confidence intervals
cc <- confint(M1, parm = "beta_")
ctab <- cbind(est = fixef(M1), cc)
cdat = as.data.frame(ctab)
# Function to back-transform to probability (0-1)
unlogit = function(y){
  y_retransformed = exp(y)/(1 + exp(y))
  y_retransformed
}
# Getting estimates
A_est = unlogit(cdat$est[1])
B_est = unlogit(cdat$est[1] + cdat$est[2])
B_lwr = unlogit(cdat$est[1] + cdat[2,2])
B_upr = unlogit(cdat$est[1] + cdat[2,3])
Difference_est = B_est - A_est
# This is how I tried to calculate the CI of the difference
Difference_lwr = B_lwr - A_est
Difference_upr = B_upr - A_est
# However, I believe this is wrong because A_est is also "uncertain"
How can I get the confidence interval of the difference in the probability of presence?
We can calculate the average treatment effect in the following way. From the original data, create two new datasets, one in which all units receive treatment A, and one in which all units receive treatment B. Now, based on your model estimates (in your case, M1), we compute predicted outcomes for units in each of these two datasets. We then compute the mean difference in the outcomes between the two datasets to get our estimated average treatment effect. Here, we can write a function that takes a glmer object and computes the average treatment effect:
ate <- function(.) {
  treat_A <- treat_B <- df
  treat_A$treatment <- "A"
  treat_B$treatment <- "B"
  c("ate" = mean(predict(., newdata = treat_B, type = "response") -
                   predict(., newdata = treat_A, type = "response")))
}
ate(M1)
# ate
# 0.09478276
How do we get the uncertainty interval? We can use the bootstrap: re-estimate the model many times on bootstrapped data (bootMer uses a parametric bootstrap by default, simulating new responses from the fitted model), calculating the average treatment effect each time. We can then use the distribution of the bootstrapped average treatment effects to compute our uncertainty interval. Here we generate 100 bootstrap replicates with the bootMer() function:
out <- bootMer(M1, ate, seed = 1234, nsim = 100)
and inspect the distribution of the effect:
quantile(out$t, c(0.025, 0.5, 0.975))
# 2.5% 50% 97.5%
# -0.06761338 0.10508751 0.26907504
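Equivalently, since bootMer() returns a boot-class object, a percentile interval can also be pulled out with boot.ci() from the boot package (a small sketch using the out object above):
library(boot)
boot.ci(out, conf = 0.95, type = "perc")   ## percentile bootstrap CI for the ATE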
We know how to plot decision boundaries for logistic regression and other classifier methods; however, I am not interested in a plot, but rather in the exact value at which the binomial prediction is 0.50.
For example:
train = data.frame(1:20)   # the auto-generated column name is "X1.20"
train$response = rep(1:0, 10)
model = glm(response ~ poly(X1.20, 2), data = train, family = binomial)
train$X1.20[1] = 10.5
predict(model, train[1, ], type = "response")
This leaves me with a decision boundary of 10.5, which I can find through trial and error with the predict() function, meaning a value of 10.5 for the independent variable gives a response of exactly 0.50. Is there an automated way to find the value that gives a response of 0.50?
You should use the fact that a predicted value of zero on the link (log-odds) scale implies a response probability of 0.5. So you can just try to find the value of x that makes the predicted link value as close to zero as possible. Here, deviationFromZero() measures how far the model's predicted link value is from zero for any given value of x.
df <- data.frame(x = 1:20, response = rep(1:0, 10))
model <- glm(response ~ poly(x, 2), data = df, family = binomial)
deviationFromZero <- function(y) abs(predict(model, data.frame(x = y)))
boundary <- optimize(f = deviationFromZero, interval = range(df$x))
boundary
$minimum
[1] 10.5

$objective
           1 
1.926772e-16 
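As a sanity check, using the model and boundary objects above, the fitted probability at the located boundary should be essentially 0.5:
predict(model, data.frame(x = boundary$minimum), type = "response")
## ~ 0.5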
I fitted a model in R with the lmer()-function from the lme4 package. I scaled the dependent variable:
mod <- lmer(scale(Y)
~ X
+ (X | Z),
data = df,
REML = FALSE)
I look at the fixed-effect coefficients with fixef(mod):
> fixef(mod)
(Intercept) X1 X2 X3 X4
0.08577525 -0.16450047 -0.15040043 -0.25380073 0.02350007
It is quite easy to calculate the means by hand from the fixed-effects coefficients. However, I want them unscaled and I am unsure how to do this exactly. I am aware that scaling means subtracting the mean from every Y and dividing by the standard deviation, and that both the mean and the standard deviation were calculated from the original data. Can I simply reverse this process after fitting an lmer() model by using the mean and standard deviation of the original data?
Thanks for any help!
Update: The way I presented the model above seems to imply that the dependent variable is scaled by taking the mean over all responses and dividing by the standard deviation of all the responses. Usually it is done differently: rather than using the overall mean and standard deviation, the responses are standardized per subject, using the mean and standard deviation of that subject's responses. (This is odd in an lmer() I think, as the random intercept should take care of that... Not to mention the fact that we are talking about calculating means on an ordinal scale...) The problem, however, stays the same: once I have fitted such a model, is there a clean way to rescale the coefficients of the fitted model?
Updated: generalized to allow for scaling of the response as well as the predictors.
Here's a fairly crude implementation.
If our original (unscaled) regression is
Y = b0 + b1*x1 + b2*x2 + ...
then our scaled regression is
(Y - mu0)/s0 = b0' + b1'*(x1 - mu1)/s1 + b2'*(x2 - mu2)/s2 + ...
This is equivalent to
Y = mu0 + s0*((b0' - b1'*mu1/s1 - b2'*mu2/s2 - ...) + b1'/s1*x1 + b2'/s2*x2 + ...)
So bi = s0*bi'/si for i > 0, and
b0 = s0*b0' + mu0 - sum(bi*mui)
Implement this:
rescale.coefs <- function(beta, mu, sigma) {
  beta2 <- beta  ## inherit names etc.
  beta2[-1] <- sigma[1]*beta[-1]/sigma[-1]
  beta2[1]  <- sigma[1]*beta[1] + mu[1] - sum(beta2[-1]*mu[-1])
  beta2
}
Try it out for a linear model:
m1 <- lm(Illiteracy~.,as.data.frame(state.x77))
b1 <- coef(m1)
Make a scaled version of the data:
ss <- scale(state.x77)
Scaled coefficients:
m1S <- update(m1,data=as.data.frame(ss))
b1S <- coef(m1S)
Now try out rescaling:
icol <- which(colnames(state.x77)=="Illiteracy")
p.order <- c(icol,(1:ncol(state.x77))[-icol])
m <- colMeans(state.x77)[p.order]
s <- apply(state.x77,2,sd)[p.order]
all.equal(b1,rescale.coefs(b1S,m,s)) ## TRUE
This assumes that both the response and the predictors are scaled.
If you scale only the response and not the predictors, then you should submit c(mean(response), rep(0, ...)) for m and c(sd(response), rep(1, ...)) for s (i.e., m and s are the values by which the variables were shifted and scaled).
If you scale only the predictors and not the response, then submit c(0, mean(predictors)) for m and c(1, sd(predictors)) for s (i.e. the predictor means and standard deviations, preceded by 0 and 1 for the unshifted, unscaled response).
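For the mixed-model case in the question (only the response scaled), a small sketch with lme4's built-in sleepstudy data (used here purely as an illustration, not the questioner's data) shows rescale.coefs() recovering the fixed effects of the unscaled fit:
library(lme4)
sleepstudy$ReactionS <- drop(scale(sleepstudy$Reaction))  ## scaled response
fm  <- lmer(Reaction  ~ Days + (Days | Subject), data = sleepstudy)
fmS <- lmer(ReactionS ~ Days + (Days | Subject), data = sleepstudy)
m <- c(mean(sleepstudy$Reaction), 0)  ## shifts: response mean, predictor not shifted
s <- c(sd(sleepstudy$Reaction), 1)    ## scales: response sd, predictor not scaled
cbind(fixef(fm), rescale.coefs(fixef(fmS), m, s))  ## the two columns should match (up to numerical tolerance)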