plot() does not show all diagnostic plots for lme/lmer

When using the lme/lmer function, I cannot get R to display all 4 diagnostic plots (res vs fit, normal-QQ, scale-location, res vs leverage) with par(mfrow=c(2,2)) and plot().
I just get the res vs fit plot and nothing else.
I have no problem when using the lm function.
Does anybody know how to do this?
library(lme4)
m0<-lmer(hematology~Treatment*day+Gender+(1|ID),data=long,na.action=na.omit,REML=FALSE)
par(mfrow=c(2,2))
plot(m0)

tl;dr ?plot.merMod explains in quite a bit of detail how the plotting methods work for fits produced by [g]lmer ...
You can get at least the first three plots corresponding to plot.lm fairly easily:
fitted vs residual with smooth line added
plot(lmer_model, type=c("p","smooth"), col.line=1)
(it's harder to get the smooth and the zero line drawn in different colours)
scale-location plot
plot(lmer_model,
sqrt(abs(resid(.)))~fitted(.),
type=c("p","smooth"), col.line=1)
Q-Q plot
lattice::qqmath(lmer_model)
residuals vs leverage
plot(lmer_model, rstudent(.) ~ hatvalues(.))
(the Cook's distances can be computed via cooks.distance() but superimposing the contours of CD={0.5,1} isn't so easy ...)
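These plot methods return lattice (trellis) objects rather than drawing with base graphics, so par(mfrow=c(2,2)) has no effect on them. A minimal sketch of one way to put the four plots above on a single page uses the split and more arguments of lattice's print method. It assumes the sleepstudy model from lme4's own examples stands in for your fit:

```r
library(lme4)
library(lattice)

# Stand-in model (an assumption); replace with your own lmer fit
lmer_model <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

# Build the four diagnostic plots as trellis objects without drawing them yet
p_resid <- plot(lmer_model, type = c("p", "smooth"), col.line = 1)
p_qq    <- qqmath(lmer_model)
p_scale <- plot(lmer_model, sqrt(abs(resid(.))) ~ fitted(.),
                type = c("p", "smooth"), col.line = 1)
p_lev   <- plot(lmer_model, rstudent(.) ~ hatvalues(.))

# split = c(column, row, n_columns, n_rows); more = TRUE keeps the page open
print(p_resid, split = c(1, 1, 2, 2), more = TRUE)
print(p_qq,    split = c(2, 1, 2, 2), more = TRUE)
print(p_scale, split = c(1, 2, 2, 2), more = TRUE)
print(p_lev,   split = c(2, 2, 2, 2), more = FALSE)
```

The layout mirrors the 2x2 panel of plot.lm, although the contents are not identical (for instance, no Cook's distance contours are drawn).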
historical note
The design and implementation of lme4 diagnostic plot methods differ from plot.lm, which is the canonical example in base R. Why? I don't know for sure, but this approach is derived from the nlme package, which predates R; the earliest version I could find is this page from the Wayback Machine (1998), which links to a copy of the user's guide for version 1.2, dated February 1995; that's four months before the first source-code release of R (via ftp) in June 1995.
it uses lattice (derived from Trellis™ graphics) rather than base-R graphics
although it doesn't automatically construct e.g. scale-location plots, it is more flexible. You can use formulas to show fitted or residual values vs parameters, facet, etc., e.g. plot(fm1,residuals(.)~Days|Subject)
there are separate commands for plotting residuals etc. (plot) and Q-Q plots (qqnorm in nlme, qqmath in lme4)

I know this is a 2-year-old question, but I had the same issue (September 2022) and then found two packages: ggResidpanel (Panel of Diagnostic Residual Plots) and redres. With ggResidpanel, where mod1 is the fitted lmer model:
library(ggResidpanel)
resid_panel(mod1, smoother = TRUE, qqbands = TRUE)

As long as we're adding answers, the performance package is now available:
library(lme4)
library(performance)
fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
check_model(fm1)

In R, plot is a generic function. This means that when you call plot, R examines the class of the object you have passed to the first argument and chooses the plotting method according to this class.
Let's take an example. Suppose I use the lm function to create a model. The resulting model object will have class "lm":
lm_model <- lm(Sepal.Length ~ Sepal.Width, data = iris)
class(lm_model)
#> [1] "lm"
That means that when I call plot(lm_model), R will see that I am calling plot on an object of class lm. Instead of trying to construct a basic xy plot as it would if I did plot(1:10), R now knows to call a plotting method that has been specifically written to plot objects of type "lm". In this case, it will dispatch the method stats:::plot.lm, which is a long function that takes the "lm" object and creates the 4 diagnostic plots.
Now let's see what we get when we create a model with lmer:
library(lme4)
lmer_model <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
class(lmer_model)
#> [1] "lmerMod"
#> attr(,"package")
#> [1] "lme4"
Our model is an object of type "lmerMod". When we call plot on this object, R looks up the correct method to plot an object of this class. Since it has a completely different structure from an object of class "lm", it wouldn't make sense to plot it with plot.lm, so the authors who created the lme4 package had to decide what the best way to plot an object of class "lmerMod" was. They wrote the method lme4:::plot.merMod, which draws the single plot you see when you call plot on your model.
Why is this? That's one for the authors to answer, but it seems the main reason is that they wanted a plot method that would cover GLMM, LMM and NLMM models (all of which are "merMod" objects). The diagnostic plots for lm don't make sense for all of these model types.
So the short answer is that there is no problem to "solve" as such; this is just not how "lmerMod" objects are plotted. If you have specific concerns about some aspects of your fit that can be answered by these diagnostic plots, you should examine these individually.
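The dispatch mechanism described above can be seen in a few lines of base R. This is only an illustrative sketch: the class "toy_fit" and its plot method are invented for the example and are not part of any package.

```r
# A hypothetical class, invented purely to illustrate S3 dispatch
toy <- structure(list(x = 1:10, y = (1:10)^2), class = "toy_fit")

# Defining a function named plot.<class> gives plot() a method for that class
plot.toy_fit <- function(x, ...) {
  plot(x$x, x$y, main = "drawn by plot.toy_fit", ...)
  invisible("plot.toy_fit")
}

# plot() sees class "toy_fit" and calls plot.toy_fit, not the default xy plot
which_method <- plot(toy)
which_method

# For "lm" objects the corresponding method is plot.lm in the stats package
lm_model <- lm(Sepal.Length ~ Sepal.Width, data = iris)
class(lm_model)
is.function(getS3method("plot", "lm"))
```

In the same way, calling plot() on an "lmerMod" object ends up in lme4's plot.merMod, which is why the output differs from what plot.lm draws.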

Related

Extract values used to make plot for parametric component of GAM in R

I have performed a GAM that includes both continuous smooth terms and a categorical variable. I have plotted the model (mod) using plot(mod,residuals=T,all.terms=T,pages=1). This produces plots of the two smooth parameters as well as the parametric parameter. I want to extract the values used to make these plots so I can re do them and make them look nicer. If I save the plot in an object, this gives me everything I need for the smooth terms, but doesn't contain any information about the parametric component: plot.mod=plot(mod,residuals=T,all.terms=T,select=0). But I can't see where the numbers are coming from for the default plotting of the parametric component. Is there a way to extract these as well?
Here is a reproducible example of what I have done so far
library(mgcv)
# create some data
data=data.frame(response=c(10,12,8,9,3,4,5,5,4,5,4,5,4,1),pred1=c(9,8,8,9,6,7,6,4,3,4,2,3,3,1),pred2=as.factor(c("A","C","B","B","A","A","C","B","C","A","C","B","A","B")),pred3=c(1,6,3,4,8,6,4,5,7,10,11,3,12,1))
# run the GAM
mod <- gam(response ~ s(pred1,k=8) + pred2 + s(pred3,k=5), data=data, family=gaussian(), method="REML")
# the default plot
plot(mod,residuals=T,all.terms=T,pages=1)
# save values in an object. But this only saves the smooth terms.
plot.mod=plot(mod,residuals=T,all.terms=T,select=0)
# How can I extract the values used to plot the parametric term?
The plot I'm trying to extract the data to make:
From the plot.gam documentation, termplot is used for the parametric terms, so
plot.para <- termplot(mod, se = TRUE, plot = FALSE)
saves that plot to a list.
The format is different than the others, but the data is there.
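To show what that list looks like and how it might be re-plotted, here is a minimal sketch. As an assumption made so the example runs with base R alone, it substitutes an lm fit on iris for the gam in the question; the returned structure (one data frame per term, with columns x, y and se) is what termplot documents for plot = FALSE.

```r
# Stand-in model (an assumption): lm instead of the gam from the question
fit <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris)

# One data frame per term, holding the x values, partial effect y, and se
tp <- termplot(fit, se = TRUE, plot = FALSE)
names(tp)
str(tp$Species)

# Redraw the factor term by hand: points with +/- 2 se bars
sp   <- tp$Species
xpos <- seq_len(nrow(sp))
plot(xpos, sp$y, xaxt = "n", pch = 19,
     ylim = range(sp$y - 2 * sp$se, sp$y + 2 * sp$se),
     xlab = "Species", ylab = "Partial effect")
axis(1, at = xpos, labels = as.character(sp$x))
segments(xpos, sp$y - 2 * sp$se, xpos, sp$y + 2 * sp$se)
```

Once the values are in a data frame like this, they can be passed to any plotting package instead of base graphics.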

Residual Plot for multivariate regression in Time Series, with time on X axis in R

I have a dataframe which is a time series. I am using the function lm to build a multivariate regression model.
linearmodel <- lm(Y~X1+X2+X3, data = data)
I want to plot the residuals of this linearmodel on the y-axis and time on the x-axis using a simple function, with the lm() object as the input.
Standard residual plotting functions like the one in the car package (car::residualPlot) give residuals on the Y-axis and fitted values on the X-axis.
Ideally, I need the residuals on the Y-axis and the timescale on the X-axis. But I understand that the function lm() is time-agnostic, so I can live with the residuals on the Y-axis in the same order as the data input and nothing on the X-axis.
Is there a plotting function that I can use by passing the linearmodel object into it (not something where I extract the residuals and use ggplot2)? For example, plot <- plotresidualsinorder(linearmodel) should give me the residuals on the Y-axis in the same order as the data input.
I want to use this plot in R-shiny ultimately.
My research led me to car package, which is wonderful in its own right, but doesn't have the function to solve my problem.
Many thanks in advance for the help.
You can use the Residual Plot information. For the proposed solution, we need to apply the lm function to a formula that describes your Y variables by the variables X1+X2+X3, and save the linear regression model in a new linearmodel variable. Finally, we compute the residual with the resid function. In your case, the following solution can be representative for your problem.
Proposed solution:
linearmodel <- lm(Y~X1+X2+X3, data = data)
lm_resid <- resid(linearmodel)
plot(seq_along(lm_resid), lm_resid,
ylab="Residuals", xlab="Time (observation order)",
main="Data")
abline(h = 0)
For help on how the resid function works, you can try:
help(resid)
Calisto's solution will work, but there is a simpler and more straightforward one. The lm object already contains the regression residuals, so you can simply call:
plot(XTime, linearmodel$residuals, main = "Residuals")
XTime is the date variable of your dataset; you may need to convert it with the POSIX functions: https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/as.POSIX*
Add parameters as you need to share it on R-shiny.
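Since the question asks for a function that takes the lm object as input, both answers can be wrapped into a small helper. The following is only a sketch: the name plot_resid_in_order, its time argument, and the simulated data are all made up for illustration.

```r
# Hypothetical helper (name and arguments invented for this sketch)
plot_resid_in_order <- function(model, time = NULL, ...) {
  r <- resid(model)
  # With no time variable supplied, fall back to the input order of the data
  if (is.null(time)) time <- seq_along(r)
  plot(time, r, type = "b", xlab = "Time", ylab = "Residuals", ...)
  abline(h = 0, lty = 2)
  invisible(data.frame(time = time, resid = r))
}

# Simulated monthly data standing in for the real time series
set.seed(42)
dat <- data.frame(
  date = seq(as.Date("2020-01-01"), by = "month", length.out = 36),
  X1 = rnorm(36), X2 = rnorm(36), X3 = rnorm(36)
)
dat$Y <- 1 + 2 * dat$X1 - dat$X2 + 0.5 * dat$X3 + rnorm(36)

linearmodel <- lm(Y ~ X1 + X2 + X3, data = dat)
out <- plot_resid_in_order(linearmodel, time = dat$date)
```

If rows were dropped because of missing values, the residual vector is shorter than the original data, so the time variable would need to be subset to the rows actually used in the fit.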

Obtain residual diagnostic plots from rms package ols() function

How do you obtain residual diagnostic plots from an ols() object? Normally, if using glm() or lm(), I'd just do plot(lm()), but plot(ols()) gives an error.
My code is:
fit <- ols(y ~ rcs(x1,4)*x2, data=data, x=TRUE, y=TRUE)
plot(fit)
The error message I receive is
Error in match.arg(type) :
'arg' should be one of “ordinary”, “score”, “dfbeta”, “dfbetas”, “dffit”, “dffits”, “hat”, “hscore”
Laboriously (but flexibly), you need to compute residuals and fitted values (using resid() and fitted()) and bind them into your data frame, then use a plotting package like ggplot2 or lattice to create the plots yourself. Harrell gives examples at the bottom of p. 153 of the 2nd edition of his book, which describes the use of this package in detail.
As a quick and dirty alternative, you can fit a version using conventional functions (e.g. lm()) and plot() will return the usual diagnostic plots. Things like rcs() will work with many of the base fitting functions.
The ols object does inherit from the lm object,
class(rcsLogFit)
## [1] "ols" "rms" "lm"
but I was unable to get stats:::plot.lm() to work without the error noted here.
A quick workaround is to make a copy that forgets its rms-specific classes:
rcsLogCopy <- rcsLogFit
class(rcsLogCopy) <- "lm"
and then
plot(rcsLogCopy)
works fine. I suppose you could do this directly on the original instead.

How to Adjust restricted cubic spline cox model using rms package?

I am trying to plot a restricted cubic spline model using the rms package. However, I can't find any way to adjust my Cox proportional hazards model; I can only get the unadjusted fit.
Here is my code:
library(survival)
library(rms)
dd <- datadist(Cox9)
options(datadist="dd")
fit_vino <- cph(Surv(follcox,evento) ~ rcs(G_VINO,3))
plot(Predict(fit_vino), lty=1, lwd=3, ylim=c(-0.5,1.0),xlim = c(0,50), col="white")
With this coding I get the unadjusted spline model.
I wondered how can I add the confounding variables to adjust the model.
I tried:
fit_vino_adj <- cph(Surv(follcox,evento) ~rcs(G_VINO+edad0+actfis+energia))
plot(Predict(fit_vino_adj), lty=2, lwd=2)
But that gives me the spline model of each variable separately. Does anyone have an idea how I can adjust my model?
Since you failed to include the data in Cox9 or show how one might construct a similar dataframe or show any output, we can only guess at what happened and respond in generalities. It appears that you are bundling the variables within the rcs function. That is unlikely to succeed, or if it does succeed seems likely that the results will be incorrect. Instead you should construct this fit and then plot only the adjusted fit of the curve you are interested in by naming the variable of focus in the Predict-call.
fit_vino_adj <- cph(Surv(follcox,evento) ~ rcs(G_VINO, 3)+edad0+actfis+energia)
plot(Predict(fit_vino_adj, name="G_VINO"), lty=2, lwd=2)
Or perhaps (assuming these are all continuous measurements) make the very slightly modified plotting call after:
fit_vino_adj2 <- cph(Surv(follcox,evento) ~ rcs(G_VINO, 3)+rcs(edad0, 3) +
rcs(actfis, 3) + rcs(energia, 3) )
plot(Predict(fit_vino_adj2), lty=2, lwd=2) # to see form of all variable fits.
If you want two or more rcs splines in the model, you need to wrap rcs around each of the other variables separately. I did not think that the rcs function was like the ^ function, which has a formula expansion method. (Although your claim that you got separate output from that second model makes me wonder whether I have completely kept up with that package.) If you wanted a complex surface, what I call "crossed splines", you would use the * operator between two rcs calls. Crossing with a factor variable will construct individual rcs-spline fits for each level of the factor.
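Because the question's data are not available, here is a small sketch of the adjusted fit on simulated data. Everything about the data frame (the name sim, its columns, and the distributions used) is an assumption made up for illustration; only the cph, rcs, datadist and Predict calls follow the pattern above.

```r
library(survival)
library(rms)

# Simulated stand-in for Cox9 (all variables here are hypothetical)
set.seed(1)
n <- 300
sim <- data.frame(
  wine      = runif(n, 0, 50),        # exposure of interest
  age       = rnorm(n, 60, 8),        # confounder to adjust for
  follow_up = rexp(n, rate = 0.1),    # follow-up time
  event     = rbinom(n, 1, 0.7)       # event indicator
)

# rms needs the predictor distributions registered before Predict() is called
dd <- datadist(sim)
options(datadist = "dd")

# The spline is applied to the exposure only; the confounder enters linearly
fit_adj <- cph(Surv(follow_up, event) ~ rcs(wine, 3) + age, data = sim)

# Predictions along wine, with age held at its datadist reference value
pred_wine <- Predict(fit_adj, name = "wine")
print(plot(pred_wine))
```

The curve is the log relative hazard as a function of wine, adjusted for age; adding further confounders only changes the right-hand side of the formula.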

Creating a Function from svyglm

I fit the following glm model using the survey package:
design <- svydesign(ids=training.data$name, design=design,family=quasibinomial(), data=training.data)
significant.model <- svyglm(Win~x+ y + start+ speed+ vx0 + vy0 + ay + az + length+ rate+ height+ hand+ zone+ count, design=design, family=quasibinomial, data=training.data)
I have a set of test data that I excluded from the model fitting process so that I would be able to see how the model predicts the outcomes for the test data and examine the difference.
Typically, I would use makeFun in the mosaic package, but this does not support objects of type svyglm. Is there another function or method that I can use to create a function for the model?
There are a lot of categorical variables with multiple levels, so writing a user-defined function is not ideal in this situation.
I'm not sure what difficulty you were experiencing since your example is not reproducible. But since an svyglm object is a glm object, makeFun() will create a wrapper around predict() just as it would do for any glm object. This has not been tested extensively, but it seems to work in the following example:
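library(mosaic)   # provides makeFun()
library(survey)   # provides svyglm()
example(svyglm)   # runs the help-page example, which creates api.reg
f <- makeFun(api.reg)
f(enroll = 500)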
