Converting the glmer output from logit to response scale

I am hoping someone can help me solve this mystery. The 7th coefficient in my glmer() call is 0.28779305 on the logit scale.
This coefficient can also be obtained by using contrast() in the emmeans package. However, that package apparently outputs the 7th coefficient on a different scale, the response scale.
How can I convert the estimate given by the contrast() call so that it matches the 7th coefficient in my glmer() call?
PS: This answer provides some insight, but I don't see how the coefficients from these two packages might be related.
library(lme4)
library(emmeans)
library(broom.mixed)
dat <- read.csv("https://raw.githubusercontent.com/fpqq/w/main/d.csv")
form2 <- y ~ item_type*time + (1 | user_id)
m2 <- glmer(form2, family = binomial, data = dat,
            control = glmerControl(optimizer = "bobyqa"))
coef(summary(m2))[7,]
# Estimate Std. Error z value Pr(>|z|) # This Estimate
#0.28779305 0.11271202 2.55334842 0.01066927 # is on logit scale
#------------------------------------------------------------------
EMM <- emmeans(m2, ~ item_type * time)
CON <- list(c1 = c(1, 0, -1, 0, -1, 0, 1, 0))
contrast(regrid(EMM), CON)
# contrast estimate SE df z.ratio p.value # This Estimate
# c1 0.106 0.0299 Inf 3.526 0.0004 # is on response scale

Following on from @Russ Lenth's answer:
contrast(EMM, CON) gives you log(o1) - log(o3) - log(o5) + log(o7), and it is indeed equal to the 7th coefficient value.
contrast(REMM, CON) (i.e. regridded) gives you p1 - p3 - p5 + p7.
(where oi = odds in group i, pi = probability in group i, oi = pi/(1-pi)).
While it is possible to convert an individual log-odds value to a probability (if loi is the log-odds value, then pi = 1/(1+exp(-loi))), I don't think there is any way to convert a linear combination of log-odds values directly to the corresponding linear combination of probability values. Instead, you have to do what emmeans is doing anyway, i.e. convert the individual log-odds values to the probability scale and then compute the linear combination. (In fact emmeans goes the other direction, computing the probabilities from the log-odds when you specify regridding.)
Your question also reveals a misunderstanding of what's going on here: you say
apparently outputs the 7th coefficient on a different scale, the response scale
The package does not output the 7th coefficient on the response scale; instead, it applies the same contrast that would give the 7th coefficient on the log scale to the group values on the probability scale. "[T]he 7th coefficient on the response scale" would be
L(log(o1) - log(o3) - log(o5) + log(o7))
(where L(x) is the logistic function 1/(1+exp(-x)) [plogis() in R]). This is not the same as
L(log(o1)) - L(log(o3)) - L(log(o5)) + L(log(o7))
which is what emmeans gives you.
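As a minimal numeric sketch of that distinction (the log-odds values below are made up), applying plogis() to the contrast is not the same as contrasting the plogis()-transformed values:
lo <- c(0.4, -0.2, 0.1, 0.5)   # hypothetical log-odds for groups 1, 3, 5, 7
w  <- c(1, -1, -1, 1)          # the contrast weights applied to those four groups
plogis(sum(w * lo))            # L(log(o1) - log(o3) - log(o5) + log(o7)), about 0.73
sum(w * plogis(lo))            # L(log(o1)) - L(log(o3)) - L(log(o5)) + L(log(o7)), about 0.25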

The bottom line is that if you do
contrast(EMM, CON)
then maybe you'll get that coefficient you are asking about. That's assuming that the 7th coefficient is correctly identified.
The EMM object contains information about estimates of 8 different probabilities (call them p1, p2, ..., p8), on the logit scale.
If you do summary(EMM), you will get estimates of logit(p1), logit(p2), ..., logit(p8).
If you do summary(EMM, type = "response"), you will get estimates of p1, p2, ..., p8
If you do contrast(EMM, CON) you will get an estimate of logit(p1) - logit(p3) - logit(p5) + logit(p7) = log(o1) - log(o3) - log(o5) + log(o7), where oj = pj / (1 - pj) is the odds for the jth case.
If you do contrast(EMM, CON, type = "response"), you get an estimate of exp(log(o1) - log(o3) - log(o5) + log(o7)) = (o1*o7) / (o3*o5)
Now, as documented, REMM = regrid(EMM) undoes the logit transformation once and for all. It preserves no memory of where it came from, it just has information about estimates on the response scale and their covariance matrix. Thus
If you do summary(REMM), you get estimates of p1, p2, ..., p8
If you do summary(REMM, type = "response"), you get estimates of p1, p2, ..., p8. There is no transformation information in REMM, it is already on the response scale.
If you do contrast(REMM, CON) (or contrast(REMM, CON, type = "response")), you get an estimate of p1 - p3 - p5 + p7.
The emmeans package documents all of this, and in addition contains several vignettes with examples. In particular, the one on transformations and the one on comparisons and contrasts are especially pertinent here.
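As a quick check (a sketch reusing the objects m2, EMM and CON defined in the question), the log-odds-scale contrast should reproduce the 7th glmer() coefficient, and type = "response" gives the odds-ratio form:
contrast(EMM, CON)                               # estimate on the logit scale, ~0.288
coef(summary(m2))[7, "Estimate"]                 # the 7th glmer() coefficient, ~0.288
summary(contrast(EMM, CON), type = "response")   # exp of the contrast, i.e. (o1*o7)/(o3*o5)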

Related

Interpretation of output and prediction from cpglmm

I am helping a colleague fit a compound Poisson generalized linear mixed model in R, using the cpglmm() function from the cplm package (link). The model involves a three-way interaction and I would like to compute some interpretable quantities. So far, I have tried to calculate some odds ratios, but I am not sure this is the right way to do it.
# Fit model with three-way interaction in fixed effects #
m <- cpglmm(ncs ~ diversity_index*diversity_speciality*n_authors + selfcit +
n_refs + (1|region), data = diversity)
# Calculate Odds-ratio #
se <- sqrt(diag(vcov(m)))
tab <- cbind(Est = m$fixef,
             S.E. = se,
             LL = m$fixef - 1.96 * se,
             UL = m$fixef + 1.96 * se)
print(exp(tab), digits=3)
I also want to compute some predicted values, e.g. predicted probabilities or the like, but I can't get predict() to work for the cpglmm. Are there any functions I could use?

How to find the value of a covariate specific to .5 probability on Logistical Regression

So, I have a binomial glm() model with two predictors, the second being a factor with two levels (50, 250).
model <- glm(GetResp.RESP ~ speed + PadLen, family = binomial(link = "logit"), data = myData)
The plot for it looks like this:
My question: How can I find the covariate (ball speed) specific to the .5 probability for each level of the second predictor?
For example, I've tried using the function dose.p(), from the package 'MASS':
dose.p(model, p = 0.5)
and I get
p = 0.5: 36.9868
which, by just looking at the plot, it would be the value for the first (50) level. Now, how can I find it for the second (250) level as well?
Thank you.
dput(myData):
https://pastebin.com/7QYXwwa4
Since this is a logistic regression, you're fitting the function:
log(p/(1-p)) = b0 + b1*speed + b2*PadLen
where p is the probability of GetResp.RESP being equal to 1, b0, b1, and b2 are the regression coefficients, and PadLen is a dummy variable equal to zero when myData$PadLen is 50 and equal to 1 when myData$PadLen is 250.
So you can solve for the speed at p = 0.5:
log(1) = b0 + b1*speed + b2*PadLen
b1*speed = log(1) - b0 - b2*PadLen
speed = (log(1) - b0 - b2*PadLen)/b1
Since log(1) = 0, this reduces to:
speed = (-b0 - b2*c(0,1))/b1
Or, putting in the actual coefficient values:
speed = (-coef(model)[1] - coef(model)[3]*c(0,1))/coef(model)[2]
To solve for speed at other probabilities, just keep the log-odds factor in the equation and enter whatever value of p you desire:
speed = (log(p/(1-p)) - coef(model)[1] - coef(model)[3]*c(0,1))/coef(model)[2]
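As a minimal sketch wrapping that last formula in a small helper (it assumes the coefficients of model are ordered intercept, speed, PadLen250, as in the fit above):
speed_at_p <- function(model, p = 0.5) {
  b <- coef(model)
  # solve log(p/(1-p)) = b0 + b1*speed + b2*PadLen for speed, with PadLen = 0 (level 50) or 1 (level 250)
  (log(p / (1 - p)) - b[1] - b[3] * c(PadLen50 = 0, PadLen250 = 1)) / b[2]
}
speed_at_p(model, 0.5)   # ball speed giving a 0.5 probability for each PadLen level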

Generating similar estimates of interactions in afex, lsmeans, and lme4 packages

I would like to know if there is a way to get the same estimate of an interaction effect in the afex and lsmeans packages as in lmer. The toy data below are for two groups with different intercepts and slopes.
set.seed(1234)
A0 <- rnorm(4,2,1)
B0 <- rnorm(4,2+3,1)
A1 <- rnorm(4,6,1)
B1 <- rnorm(4,6+2,1)
A2 <- rnorm(4,10,1)
B2 <- rnorm(4,10+1,1)
A3 <- rnorm(4,14,1)
B3 <- rnorm(4,14+0,1)
score <- c(A0,B0,A1,B1,A2,B2,A3,B3)
id <- factor(rep(1:8,times = 4, length = 32))
time <- factor(rep(0:3, each = 8, length = 32))
timeNum <- as.numeric(rep(0:3, each = 8, length = 32))
group <- factor(rep(c("A","B"), times =2, each = 4, length = 32))
df <- data.frame(id, group, time, timeNum, score)
df
And here is the plot
(ggplot(df, aes(x = time, y = score, group = group)) +
stat_summary(fun.y = "mean", geom = "line", aes(linetype = group)) +
stat_summary(fun.y = "mean", geom = "point", aes(shape = group), size = 3) +
coord_cartesian(ylim = c(0,18)))
When I run a standard lmer on the data, looking for an estimate of the difference in the change in score over time between groups:
summary(modelLMER <- lmer(score ~ group * timeNum + (timeNum|id), df))
I get an estimate for the group*time interaction of -1.07, which means that the increase in score for a one-unit increase in time is ~1 point less in group B than group A. This estimate matches the preset differences I built into the dataset.
What I would like to know is how to do a similar thing in the afex and lsmeans packages.
library(afex)
library(lsmeans)
First I generated the afex model object
modelLM <- aov_ez(id="id", dv="score", data=df, between="group", within="time",
type=3, return="lm")
Then passed that into the lsmeans function
lsMeansLM <- lsmeans(modelLM, ~rep.meas:group)
My goal is to generate an accurate estimate of the group*time interaction in afex and lsmeans. To do so requires specifying custom contrast matrices based on the split specified in the lsmeans function above.
groupMain = list(c(-1,-1,-1,-1,1,1,1,1)) # group main effect
linTrend = list(c(-3,-1,1,3,-3,-1,1,3)) # linear trend
linXGroup = mapply("*", groupMain, linTrend) # group x linear trend interaction
Then I made a master list
contrasts <- list(groupMain=groupMain, linTrend=linTrend, linXGroup=linXGroup)
Which I passed into the contrast function in lsmeans.
contrast(lsMeansLM, contrasts)
The F and p values in the output match those for the automatic tests for linear trend and for the group difference in linear trend generated by a mixed ANCOVA in SPSS. However, the mixed ANCOVA does not generate an estimate.
The estimate of the effect using the procedure above, instead of being approximately -1 as in the lmer (and matching the difference I built into the data), is approximately -10, which is wildly inaccurate.
I assume it has something to do with how I am coding the contrast coefficients. I know that if I normalise the coefficients of the groupMain matrix by dividing them all by four, that yields an accurate estimate of the main effect of group averaged across all time points. But I have no idea how to get an accurate estimate of either the linear trend averaged across groups (linTrend) or the difference in linear trend across groups (linXGroup).
I am not sure if this question is more suitable for here or Cross Validated. I figured here first because it seems to be software related, but I know there are probably deeper issues involved. Any help would be much appreciated.
The issue here is that timeNum is a numeric predictor. Therefore, the interaction is a comparison of slopes. Note this:
> lstrends(modelLMER, ~group, var = "timeNum")
group timeNum.trend SE df lower.CL upper.CL
A 4.047168 0.229166 6.2 3.490738 4.603598
B 2.977761 0.229166 6.2 2.421331 3.534191
Degrees-of-freedom method: satterthwaite
Confidence level used: 0.95
> pairs(.Last.value)
contrast estimate SE df t.ratio p.value
A - B 1.069407 0.3240897 6.2 3.3 0.0157
There's your 1.07 - the opposite sign because the comparison is in the other direction.
I will further explain that the lsmeans result you describe in the question is a comparison of the two group means, not an interaction contrast. lsmeans uses a reference grid:
> ref.grid(modelLMER)
'ref.grid' object with variables:
group = A, B
timeNum = 1.5
and as you can see, timeNum is being held fixed at its mean of 1.5. The LS means are predictions for each group at timeNum = 1.5 -- often called the adjusted means; and the difference is thus the difference between those two adjusted means.
Regarding the discrepancy claimed in obtaining your linear contrast of about 10.7: The linear contrast coefficients c(-3,-1,1,3) give you a multiple of the slope of the line. To get the slope, you need to divide by sum(c(-3,-1,1,3)^2) -- and also multiply by 2, because the contrast coefficients increment by 2.
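As a quick numeric check of that scaling (a sketch with a made-up slope, unrelated to the model above):
b  <- 3                         # hypothetical slope
y  <- b * (0:3)                 # points on a straight line at times 0, 1, 2, 3
cc <- c(-3, -1, 1, 3)           # linear trend contrast coefficients
sum(cc * y)                     # 10 * b: a multiple of the slope
sum(cc * y) * 2 / sum(cc^2)     # divide by sum(cc^2) = 20, multiply by the increment of 2: recovers b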
Thanks to the invaluable help of @rvl I was able to solve this. Here is the code.
In order to generate the correct contrast matrices, we first need to normalise them:
(mainMat <- c(-1,-1,-1,-1,1,1,1,1)) # main effects matrix
(trendMat <- c(-3,-1,1,3,-3,-1,1,3)) # linear trend contrast coefficients
(nTimePoints <- 4) # number of timePoints
(mainNorm <- 1/nTimePoints)
(nGroups <- 2) # number of between-Ss groups
(trendIncrem <- 2) # the incremental increase of each new trend contrast coefficient
(trendNorm <- trendIncrem/(sum(trendMat^2))) # normalising the trend coefficients
Now we create several contrast matrices in the form of lists. These are normalised using the objects we created above
(groupMain = list(mainMat*mainNorm)) # normalised group main effect
(linTrend = list(trendMat*trendNorm)) # normalised linear trend
(linXGroup = list((mainMat*trendMat)*(nGroups*trendNorm))) # group x linear trend interaction
Now pass those lists of matrices into a master list
contrasts <- list(groupMain=groupMain, linTrend=linTrend, linXGroup=linXGroup)
And pass that master list into the contrast() function in lsmeans:
contrast(lsMeansLM, contrasts)
This is the output
contrast estimate SE df t.ratio p.value
c(-0.25, -0.25, -0.25, -0.25, 0.25, 0.25, 0.25, 0.25) 1.927788 0.2230903 6 8.641 0.0001
c(-0.15, -0.05, 0.05, 0.15, -0.15, -0.05, 0.05, 0.15) 3.512465 0.1609290 6 21.826 <.0001
c(0.3, 0.1, -0.1, -0.3, -0.3, -0.1, 0.1, 0.3) -1.069407 0.3218581 6 -3.323 0.0160
How do we check if these are accurate estimates?
Note first that the estimate of the group*time interaction is now approximately the same value as is returned by
summary(modelLMER)
The 'main effect' trend (for want of a better descriptor), which is the rate of change in score across the four time points averaged across both levels of group, is 3.51. If we change the coding of the group factor to simple coding via
contrasts(df$group) <- c(-.5,.5)
and run summary(modelLMER) again, the time estimate will now be 3.51.
Finally, for the main effect of group, that is, the difference in score between groups averaged across all time points, we can run
pairs(lsmeans(modelLM,"group"))
And this will be -1.92. Thank you @rvl, a great answer. Using afex and lsmeans we have now forced a mixed ANCOVA that treats the repeated-measures variable as categorical to give us estimates of group differences in trend and main effects that match those returned by a mixed-effects model where the repeated-measures variable is continuous, and with p- and F-values that match those of SPSS.

How to unscale the coefficients from an lmer()-model fitted with a scaled response

I fitted a model in R with the lmer() function from the lme4 package. I scaled the dependent variable:
mod <- lmer(scale(Y) ~ X + (X | Z),
            data = df,
            REML = FALSE)
I look at the fixed-effect coefficients with fixef(mod):
> fixef(mod)
(Intercept) X1 X2 X3 X4
0.08577525 -0.16450047 -0.15040043 -0.25380073 0.02350007
It is quite easy to calculate the means by hand from the fixed-effects coefficients. However, I want them to be unscaled and I am unsure how to do this exactly. I am aware that scaling means subtracting the mean from every Y and dividing by the standard deviation. But both the mean and the standard deviation were calculated from the original data. Can I simply reverse this process after fitting an lmer() model by using the mean and standard deviation of the original data?
Thanks for any help!
Update: The way I presented the model above seems to imply that the dependent variable is scaled by taking the mean over all responses and dividing by the standard deviation of all the responses. Usually, it is done differently: rather than using the overall mean and standard deviation, the responses are standardized per subject, using the mean and standard deviation of that subject's responses. (This is odd in an lmer() I think, as the random intercept should take care of that... not to mention the fact that we are talking about calculating means on an ordinal scale...) The problem, however, stays the same: once I have fitted such a model, is there a clean way to rescale the coefficients of the fitted model?
Updated: generalized to allow for scaling of the response as well as the predictors.
Here's a fairly crude implementation.
If our original (unscaled) regression is
Y = b0 + b1*x1 + b2*x2 ...
Then our scaled regression is
(Y - mu0)/s0 = b0' + b1'*(x1 - mu1)/s1 + b2'*(x2 - mu2)/s2 + ...
This is equivalent to
Y = mu0 + s0*((b0' - b1'*mu1/s1 - b2'*mu2/s2 - ...) + b1'*x1/s1 + b2'*x2/s2 + ...)
So bi = s0*bi'/si for i > 0, and
b0 = s0*b0' + mu0 - sum(bi*mui)
Implement this:
rescale.coefs <- function(beta, mu, sigma) {
    beta2 <- beta                ## inherit names etc.
    beta2[-1] <- sigma[1] * beta[-1] / sigma[-1]
    beta2[1]  <- sigma[1] * beta[1] + mu[1] - sum(beta2[-1] * mu[-1])
    beta2
}
Try it out for a linear model:
m1 <- lm(Illiteracy~.,as.data.frame(state.x77))
b1 <- coef(m1)
Make a scaled version of the data:
ss <- scale(state.x77)
Scaled coefficients:
m1S <- update(m1,data=as.data.frame(ss))
b1S <- coef(m1S)
Now try out rescaling:
icol <- which(colnames(state.x77)=="Illiteracy")
p.order <- c(icol,(1:ncol(state.x77))[-icol])
m <- colMeans(state.x77)[p.order]
s <- apply(state.x77,2,sd)[p.order]
all.equal(b1,rescale.coefs(b1S,m,s)) ## TRUE
This assumes that both the response and the predictors are scaled.
If you scale only the response and not the predictors, then you should submit c(mean(response), rep(0, ...)) for m and c(sd(response), rep(1, ...)) for s (i.e., m and s are the values by which the variables were shifted and scaled).
If you scale only the predictors and not the response, then submit c(0,mean(predictors)) for m and c(1,sd(predictors)) for s.
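For the model in the question (only the response scaled), here is a minimal sketch using rescale.coefs() above; it assumes mod, df and Y are the objects from the question and that scale() used the overall mean and sd of Y:
fe <- fixef(mod)                                 # coefficients on the scale(Y) scale
m_vec <- c(mean(df$Y), rep(0, length(fe) - 1))   # response shifted by mean(Y); predictors not shifted
s_vec <- c(sd(df$Y),   rep(1, length(fe) - 1))   # response scaled by sd(Y); predictors not scaled
rescale.coefs(fe, m_vec, s_vec)                  # fixed effects on the original Y scale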

Calculating R^2 for a nonlinear least squares fit

Suppose I have x values, y values, and expected y values f (from some nonlinear best fit curve).
How can I compute R^2 in R? Note that this is not a linear model but a nonlinear least squares (nls) fit, so not an lm fit.
You just use the lm function to fit a linear model:
x = runif(100)
y = runif(100)
spam = summary(lm(x~y))
> spam$r.squared
[1] 0.0008532386
Note that the R-squared is not defined for nonlinear models, or is at least very tricky; to quote from R-help:
There is a good reason that an nls model fit in R does not provide
r-squared - r-squared doesn't make sense for a general nls model.
One way of thinking of r-squared is as a comparison of the residual
sum of squares for the fitted model to the residual sum of squares for
a trivial model that consists of a constant only. You cannot
guarantee that this is a comparison of nested models when dealing with
an nls model. If the models aren't nested this comparison is not
terribly meaningful.
So the answer is that you probably don't want to do this in the first
place.
If you want peer-reviewed evidence, see this article for example; it's not that you can't compute the R^2 value, it's just that it may not mean the same thing/have the same desirable properties as in the linear-model case.
Sounds like f are your predicted values. So take the squared distances from them to the actual values, divided by n * variance of y.
So something like
1 - sum((y - f)^2) / (length(y) * var(y))
should give you a quasi R-squared value, so long as your model is reasonably close to a linear model and n is pretty big.
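As a self-contained sketch of that formula (the data and the exponential model below are simulated purely for illustration):
set.seed(1)
x <- seq(0, 10, length.out = 50)
y <- 3 * exp(-0.4 * x) + rnorm(50, sd = 0.1)        # simulated data from a known curve
fit <- nls(y ~ a * exp(b * x), start = list(a = 2, b = -0.3))
f <- fitted(fit)                                    # predicted values from the nls fit
1 - sum((y - f)^2) / (length(y) * var(y))           # quasi R-squared as above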
As a direct answer to the question asked (rather than arguing that R-squared / pseudo R-squared values aren't useful), the nagelkerke function in the rcompanion package will report various pseudo R-squared values for nonlinear least squares (nls) models, as proposed by McFadden, Cox and Snell, and Nagelkerke, e.g.
library(rcompanion)  # provides nagelkerke() and the BrendonSmall data; nls() itself is in base R
data(BrendonSmall)
quadplat = function(x, a, b, clx) {
ifelse(x < clx, a + b * x + (-0.5*b/clx) * x * x,
a + b * clx + (-0.5*b/clx) * clx * clx)}
model = nls(Sodium ~ quadplat(Calories, a, b, clx),
data = BrendonSmall,
start = list(a = 519,
b = 0.359,
clx = 2304))
nullfunct = function(x, m){m}
null.model = nls(Sodium ~ nullfunct(Calories, m),
data = BrendonSmall,
start = list(m = 1346))
nagelkerke(model, null=null.model)
The soilphysics package also reports Efron's pseudo R2 and adjusted pseudo R2 value for nls models as 1 - RSS/TSS:
pred <- predict(model)
n <- length(pred)
res <- resid(model)
w <- weights(model)
if (is.null(w)) w <- rep(1, n)
rss <- sum(w * res ^ 2)
resp <- pred + res
center <- weighted.mean(resp, w)
r.df <- summary(model)$df[2]
int.df <- 1
tss <- sum(w * (resp - center)^2)
r.sq <- 1 - rss/tss
adj.r.sq <- 1 - (1 - r.sq) * (n - int.df) / r.df
out <- list(pseudo.R.squared = r.sq,
adj.R.squared = adj.r.sq)
which is also the pseudo R-squared as calculated by the accuracy function in the rcompanion package. Basically, this R-squared measures how much better your fit is compared to simply drawing a flat horizontal line through the data. This can make sense for nls models if your null model is an intercept-only model, and it can also make sense for certain other nonlinear models. E.g. for a scam model that uses strictly increasing splines (bs="mpi" in the spline term), the fitted model in the worst possible scenario (e.g. where your data were strictly decreasing) would be a flat line, and hence would result in an R-squared of zero. The adjusted R-squared then also penalizes models with a larger number of fitted parameters. Using the adjusted R-squared value would already address a lot of the criticisms of the paper linked above, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2892436/ (besides, if one swears by using information criteria for model selection, the question becomes which one to use: AIC, BIC, EBIC, AICc, QIC, etc.).
Just using
r.sq <- max(cor(y,yfitted),0)^2
adj.r.sq <- 1 - (1 - r.sq) * (n - int.df) / r.df
I think this would also make sense if you have normal Gaussian errors, i.e. the correlation between the observed and fitted y (clipped at zero, so that a negative relationship would imply zero predictive power), squared, and then adjusted for the number of fitted parameters in the adjusted version. If y and yfitted go in the same direction, this would be the R-squared and adjusted R-squared value as reported for a regular linear model. To me this would make perfect sense at least, so I don't agree with outright rejecting the usefulness of pseudo R-squared values for nls models as the answer above seems to imply.
For non-normal error structures (e.g. if you were using a GAM with non-normal errors) the McFadden pseudo R2 is defined analogously as
1-residual deviance/null deviance
See here and here for some useful discussion.
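For example, a minimal sketch of McFadden's measure for a simulated logistic regression (the data are made up; glm() reports both deviances directly):
set.seed(2)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(0.5 + 1.2 * x))   # simulated binary outcome
fit <- glm(y ~ x, family = binomial)
1 - fit$deviance / fit$null.deviance         # McFadden pseudo R-squared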
Another quasi-R-squared for non-linear models is to square the correlation between the actual y-values and the predicted y-values. For linear models this is the regular R-squared.
As an alternative approach to this problem, I have used the following procedure several times (sketched below):
compute a fit on the data with the nls function
using the resulting model, make predictions
plot the data against the values predicted by the model (if the model is good, points should lie near the bisector)
compute the R-squared of the linear regression.
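A minimal sketch of that procedure, reusing (as an assumption) the nls fit named model and the BrendonSmall data from the rcompanion example earlier in this thread:
pred <- predict(model)                                # step 2: predictions from the nls fit
plot(BrendonSmall$Sodium, pred); abline(0, 1)         # step 3: observed vs predicted, near the bisector if the fit is good
summary(lm(BrendonSmall$Sodium ~ pred))$r.squared     # step 4: R-squared of the linear regression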
Best wishes to all. Patrick.
With the modelr package
modelr::rsquare(nls_model, data)
nls_model <- nls(mpg ~ a / wt + b, data = mtcars, start = list(a = 40, b = 4))
modelr::rsquare(nls_model, mtcars)
# 0.794
This gives essentially the same result as the longer way described by Tom from the rcompanion resource.
Longer way with nagelkerke function
nullfunct <- function(x, m){m}
null_model <- nls(mpg ~ nullfunct(wt, m),
data = mtcars,
start = list(m = mean(mtcars$mpg)))
nagelkerke(nls_model, null_model)[2]
# 0.794 or 0.796
Lastly, using predicted values
library(magrittr)  # for the %>% pipe
lm(mpg ~ predict(nls_model), data = mtcars) %>% broom::glance()
# 0.795
Like they say, it's only an approximation.
