removing covariates from a linear mixed model using update - r

I'm newish to R. I have a linear mixed model with several predictors and I want to test the significance of each of them. I know that I could use lmerTest but my co-authors want me to do a likelihood ratio test for each predictor instead. I would like to use the update function to get a series of submodels that omit each predictor in turn. I tried the following
library(lme4)
data(mtcars)
h=lmer(mpg ~ 1 + cyl + disp + hp + drat + (1|carb), data=mtcars)
predvars=c("cyl","disp","hp","drat")
for (i in predvars){
modelform=update(as.formula(paste0("h, . ~ . -",i)))
print(summary(modelform))
}
I got the following error
Error in parse(text = x, keep.source = FALSE) :
:1:2: unexpected ','
1: h,
^
I also tried using lapply
Fits=lapply(predvars, function(x) {update(h, .~.-i, list(i=as.name(x)))})
names(Fits)=predvars
which doesn't actually update the model; it just refits the full model once per predictor. What am I doing wrong? Thanks.

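Your first attempt generates an error because you put h inside the string passed to as.formula, so R tries to parse "h, . ~ . -cyl" as a formula. Pass the model to update() as the first argument and build only the formula from the string: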
modelform <- update(h, as.formula(paste0(". ~ . -",i)))
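Putting the pieces together, the whole loop might look like the sketch below. It fits with lm() instead of lmer() purely so that it runs in base R without lme4 installed; that substitution is mine, and the update() call is the same for the lmer fit h. The names full, lrt_drop_one, results and fits are made up for illustration. The last part shows one possible repair of the lapply attempt: in the question, the formula . ~ . - i asks update() to drop a term literally called i, which is not in the model, so nothing is removed. Splicing the term name in with bquote() avoids that.

```r
# Sketch with lm() so it runs in base R; for the model in the question,
# use lme4::lmer() and keep the (1 | carb) term in the formula.
data(mtcars)
full <- lm(mpg ~ 1 + cyl + disp + hp + drat, data = mtcars)
predvars <- c("cyl", "disp", "hp", "drat")

# Hypothetical helper: drop one term with update(), then compute the
# likelihood ratio test of the reduced model against the full model.
lrt_drop_one <- function(term, model) {
  reduced <- update(model, as.formula(paste(". ~ . -", term)))
  stat <- as.numeric(2 * (logLik(model) - logLik(reduced)))
  df <- attr(logLik(model), "df") - attr(logLik(reduced), "df")
  data.frame(dropped = term, chisq = stat, df = df,
             p = pchisq(stat, df = df, lower.tail = FALSE))
}

results <- do.call(rbind, lapply(predvars, lrt_drop_one, model = full))
print(results)

# Repairing the lapply() attempt: splice the term name into the formula
fits <- lapply(predvars, function(x) {
  update(full, eval(bquote(. ~ . - .(as.name(x)))))
})
names(fits) <- predvars
```

With the lmer model, anova(reduced, h) performs the likelihood ratio test directly; by default it refits REML fits with ML before comparing them.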

Related

fixest::feols and ggeffects::ggeffect not working together in R

I'm having a hard time getting a fixest object to play nicely with ggeffects in R, when fixed effects are included.
When I run the following code:
library(fixest)
library(ggeffects)
m <- feols(mpg ~ disp + gear + hp | cyl, mtcars,
cluster = c("am", "cyl"))
summary(m)
marg1 <- ggeffect(m, terms = c("disp"))
I get an error reading:
Can't compute marginal effects, 'effects::Effect()' returned an error.
Reason: non-conformable arguments
You may try 'ggpredict()' or 'ggemmeans()'.
However, there are no problems when I remove the fixed effects term / include it without using the pipe:
m <- feols(mpg ~ disp + gear + hp + cyl, mtcars,
cluster = c("am", "cyl"))
summary(m)
marg1 <- ggeffect(m, terms = c("disp"))
ggpredict also returns an error on my data (Could not compute variance-covariance matrix of predictions. No confidence intervals are returned.) but I am unable to replicate that same error using the toy data.

Error when using regr() command: undefined columns selected

I get the following error when trying to run the regr() command from the yhat package:
Error in `[.data.frame`(new.data, , c(DV, IVx)) :
undefined columns selected
Here is the code I'm using:
DEregr_model <- lm(TotalBiomass ~ propnC + propnV + propnR + I(propnC^2) + I(propnV^2) + propnC:propnV + propnV:propnR + propnV:I(propnC^2), DE_model)
DEregrout <- regr(DEregr_model)
Why is this function returning an error?
I think I can demonstrate my suspicion expressed in the comments with this MCVE:
> lm.gas <- lm( mpg ~ hp + disp +hp:I(disp^2), data= mtcars)
> lm.gas
Call:
lm(formula = mpg ~ hp + disp + hp:I(disp^2), data = mtcars)
Coefficients:
 (Intercept)            hp          disp  hp:I(disp^2)
   3.562e+01    -4.168e-02    -5.879e-02     3.151e-07
> install.packages("yhat")
also installing the dependency ‘yacca’
> library(yhat)
> regr(lm.gas)
Error in `[.data.frame`(new.data, , c(DV, IVx)) :
undefined columns selected
In addition: Warning message:
In regr(lm.gas) : NAs introduced by coercion
I suspect that the I(.) terms are not being saved in the result of the lm call in a manner that the regr function is able to handle.
The workaround would be to calculate the values of the squared variables under separate names in an augmented dataset.
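Applied to the mtcars example, that workaround could look like the sketch below. The column name disp2 and the object names gas and lm.gas2 are chosen here for illustration. The block only checks that the two specifications give the same fit and that every variable in the new formula is a real column; it does not call regr(), so whether regr(lm.gas2) then succeeds still has to be confirmed by running it.

```r
# Precompute the squared variable under its own column name
gas <- transform(mtcars, disp2 = disp^2)

# Original specification with I(), and the rewritten one without it
lm.gas  <- lm(mpg ~ hp + disp + hp:I(disp^2), data = mtcars)
lm.gas2 <- lm(mpg ~ hp + disp + hp:disp2, data = gas)

# The two fits have the same coefficients (only the labels differ)
same_fit <- all.equal(unname(coef(lm.gas)), unname(coef(lm.gas2)))

# Every variable in the rewritten formula is a plain column of the data
plain_columns <- all(all.vars(formula(lm.gas2)) %in% names(gas))

print(same_fit)
print(plain_columns)
```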
Based on the comments, I figured out the issue. The squared terms wrapped in I() (e.g., I(propnV^2)) weren't being read correctly by the function. So I added columns to my data frame holding the squared values, so that the model reads these terms as ordinary variables rather than trying to parse them. Corrected code is below:
## make new columns for the squared seeding rate proportions
DE_model$propnC2 <- DE_model$propnC^2
DE_model$propnV2 <- DE_model$propnV^2
DE_model$propnR2 <- DE_model$propnR^2
## run lm model with adjusted terms
DEregr_model <- lm(TotalBiomass ~ propnC + propnV + propnR + propnC2 + propnV2 + propnC:propnV + propnV:propnR + propnV:propnC2, DE_model)
DEregrout <- regr(DEregr_model)
The regr() function now runs without error, thanks everyone for your input!

Margins Package error using quadratic and interaction terms

I have code which uses the margins command in Stata and I am trying to replicate it in R using the "margins" package from CRAN.
I keep getting the error:
marg1<-margins(reg2)
Error in names(classes) <- clean_terms(names(classes)) : 'names' attribute [18] must be the same length as the vector [16]
A minimal reproducible example is shown below:
install.packages("margins")
library(margins)
mod1 <- lm(log(mpg) ~ vs + cyl + hp + vs*hp + I(vs*hp*hp) + wt + I(hp*hp), data = mtcars)
(marg1 <- margins(mod1))
summary(marg1)
I need vs to be a dummy variable interacted with both a quadratic term and a normal interaction.
Does anyone know what I am doing wrong or if there is a way around this?
Your model specification is a bit confusing. For example, vs*hp introduces 3 variables: i) vs, ii) hp and iii) interaction vs and hp. As a result, hp appears twice in the formula you provided. You can simplify massively! Try this for example (I think it is what you want):
mtcars$hp2 = mtcars$hp^2
mod1 <- lm(log(mpg) ~ cyl + wt + vs*hp + vs*hp2, data = mtcars)
summary(mod1) # With this you can check that the model you specified is what you want
(marg1 <- margins(mod1)) # The error disappeared.
summary(marg1)
In general, I would recommend avoiding I() in formula specifications where you can, as it often gives rise to errors like this one when not treated with enough care (though sometimes it cannot be avoided). Good luck!
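To see which terms a formula actually expands to before fitting anything, you can inspect it with terms(). This is a small sketch; f_orig, orig_terms and star_terms are names made up here.

```r
# The formula from the question
f_orig <- log(mpg) ~ vs + cyl + hp + vs*hp + I(vs*hp*hp) + wt + I(hp*hp)

# Term labels R creates for it (duplicates such as the repeated hp are merged)
orig_terms <- attr(terms(f_orig), "term.labels")
print(orig_terms)

# vs*hp on its own is shorthand for vs + hp + vs:hp
star_terms <- attr(terms(y ~ vs*hp), "term.labels")
print(star_terms)
```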

make R report adjusted R squared and F-test in output with robust standard errors

I have estimated a linear regression model using lm(x ~ y1 + y2 + ... + yn), and to counter the heteroscedasticity present I had R estimate robust standard errors with
coeftest(model, vcov = vcovHC(model, type = "HC0"))
I know that the (robust) R squared and F statistic from the "normal" model are still valid, but how do I get R to report them in the output? I want to fuse several regression outputs from different specifications together with stargazer, and it would become very chaotic if I had to enter the non-robust model alongside just to get these statistics. Ideally I want to enter a regression output into stargazer that contains these statistics, thus importing it to their framework.
Thanks in advance for all answers
I don't have a solution with stargazer, but I do have a couple of viable alternatives for regression tables with robust standard errors:
Option 1
Use the modelsummary package to make your tables.
It has a statistic_override argument which allows you to supply a function that calculates a robust variance-covariance matrix (e.g., sandwich::vcovHC).
library(modelsummary)
library(sandwich)
mod1 <- lm(drat ~ mpg, mtcars)
mod2 <- lm(drat ~ mpg + vs, mtcars)
mod3 <- lm(drat ~ mpg + vs + hp, mtcars)
models <- list(mod1, mod2, mod3)
modelsummary(models, statistic_override = vcovHC)
Note 1: The screenshot above is from an HTML table, but the modelsummary package can also save Word, LaTeX or markdown tables.
Note 2: I am the author of this package, so please treat this as a potentially biased view.
Option 2
Use the estimatr::lm_robust function, which automatically includes robust standard errors. I believe that estimatr is supported by stargazer, but I know that it is supported by modelsummary.
library(estimatr)
mod1 <- lm_robust(drat ~ mpg, mtcars)
mod2 <- lm_robust(drat ~ mpg + vs, mtcars)
mod3 <- lm_robust(drat ~ mpg + vs + hp, mtcars)
models <- list(mod1, mod2, mod3)
modelsummary(models)
This is how to go about it. You need to use a model object that is supported by stargazer as a template, and then you can provide a list with the standard errors to be used:
library(dplyr)
library(lmtest)
library(sandwich) # provides vcovHC()
library(stargazer)
# Basic Model ---------------------------------------------------------------------------------
model1 <- lm(hp ~ factor(gear) + qsec + cyl + factor(am), data = mtcars)
summary(model1)
# Robust standard Errors ----------------------------------------------------------------------
model_robust <- coeftest(model1, vcov = vcovHC(model1, type = "HC0"))
# Get robust standard Errors (sqrt of diagonal element of variance-covariance matrix)
se = vcovHC(model1, type = "HC0") %>% diag() %>% sqrt()
stargazer(model1, model1,
se = list(NULL, se), type = 'text')
Using this approach you can use stargazer even for model objects that are not supported. You only need coefficients, standard errors and p-values as vectors. Then you can 'mechanically insert' even unsupported models.
One last note. You are correct that R-squared can still be used when heteroskedasticity is present. However, the overall F-test and the t-tests computed from the conventional (non-robust) standard errors are NOT valid anymore.
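If you want a robust overall test to report next to R-squared and adjusted R-squared, one option is a Wald test built on the robust covariance matrix. The sketch below computes the HC0 matrix by hand in base R so that it runs without extra packages; it is meant to correspond to what vcovHC(model1, type = "HC0") returns, but that correspondence is not checked here. The object names (vcov_hc0, wald_F, fit_stats) are illustrative, and using the residual degrees of freedom for the F reference distribution is a choice, not the only convention.

```r
model1 <- lm(hp ~ factor(gear) + qsec + cyl + factor(am), data = mtcars)

# HC0 covariance by hand: (X'X)^-1 X' diag(e^2) X (X'X)^-1
X <- model.matrix(model1)
e <- residuals(model1)
bread <- solve(crossprod(X))
meat <- crossprod(X * e)            # row i of X scaled by e[i], then X'X
vcov_hc0 <- bread %*% meat %*% bread
robust_se <- sqrt(diag(vcov_hc0))

# Robust Wald test that all slope coefficients are jointly zero
slopes <- setdiff(names(coef(model1)), "(Intercept)")
b <- coef(model1)[slopes]
q <- length(b)
wald_F <- as.numeric(t(b) %*% solve(vcov_hc0[slopes, slopes]) %*% b) / q
p_value <- pf(wald_F, q, df.residual(model1), lower.tail = FALSE)

# R-squared and adjusted R-squared do not depend on the standard errors
fit_stats <- c(r.squared     = summary(model1)$r.squared,
               adj.r.squared = summary(model1)$adj.r.squared,
               robust.F      = wald_F,
               p.value       = p_value)
print(fit_stats)
```

The resulting numbers can then be added to a table by hand, for example as extra rows, instead of entering the non-robust model a second time.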

Estimating Coefficients of Multiple Regression

I have the following problem:
I'm using the "RailTrail" data from the "mosaicData" library.
I already have the coefficients of the following linear regression model, computed for the population:
lm(volume ~ hightemp + cloudcover + weekday, data = RailTrail)
Now I need to estimate the coefficients of that model from samples and build a 95% confidence interval.
So I need to compute the coefficients for all of the data samples previously generated. I was asked to use a 'for' loop, but I don't know how to compute the LR models. I also need to store the coefficients obtained.
I tried to do it doing this
trial <- list()
set.seed(101)
for(i in 1:100){
trial[[i]] <- RailTrail %>%
lm(volume ~ hightemp + cloudcover + weekday, data = RailTrail)
}
but I get the following error:
Error in xj[i] : invalid subscript type 'language'
Thank you,
Don't hesitate to ask for clarification if my request is not clear.
Francisco
Do you mean the confidence interval for the model parameters? If so, this example I hope illustrates a succinct way of doing so:
model <- lm(mpg ~ cyl + disp + gear, data = mtcars)
# tidy() can return the intervals itself; confint_tidy() is deprecated in recent broom releases
broom::tidy(model, conf.int = TRUE, conf.level = 0.95)
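If the goal is instead to refit the model on many samples in a for loop and keep the coefficients, a sketch along these lines may help. It assumes the samples are bootstrap resamples (rows drawn with replacement), which the question does not state, and it uses mtcars so that it runs without mosaicData; swap in RailTrail and the volume formula for the real task. All object names are illustrative. The error in the question most likely comes from piping RailTrail into lm() while also passing data = RailTrail: the pipe puts the data frame in the formula position, which pushes the formula into the subset argument.

```r
set.seed(101)
n_boot <- 100
full_fit <- lm(mpg ~ cyl + disp + gear, data = mtcars)

# One row per resample, one column per coefficient
boot_coefs <- matrix(NA_real_, nrow = n_boot, ncol = length(coef(full_fit)),
                     dimnames = list(NULL, names(coef(full_fit))))

for (i in seq_len(n_boot)) {
  rows <- sample(nrow(mtcars), replace = TRUE)   # resample rows with replacement
  boot_coefs[i, ] <- coef(lm(mpg ~ cyl + disp + gear, data = mtcars[rows, ]))
}

# Percentile 95% interval for each coefficient
boot_ci <- t(apply(boot_coefs, 2, quantile, probs = c(0.025, 0.975)))
print(boot_ci)
```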
