How to update summary when using NeweyWest? - r

I am using NeweyWest standard errors to correct my lm() / dynlm() output. E.g.:
fit1<-dynlm(depvar~covariate1+covariate2)
coeftest(fit1,vcov=NeweyWest)
Coefficients are displayed the way I'd like, but unfortunately I lose all the regression output information, such as R-squared and the F-test, that is displayed by summary. So I wonder how I can display robust SEs and all the other output in the same summary.
Is there a way to either get everything in one call or to overwrite the 'old' estimates?
I bet I just missed something badly, but that is really relevant when sweaving the output.
Test example, taken from ?dynlm.
require(dynlm)
require(sandwich)
require(lmtest) # provides coeftest()
data("UKDriverDeaths", package = "datasets")
uk <- log10(UKDriverDeaths)
dfm <- dynlm(uk ~ L(uk, 1) + L(uk, 12))
#shows R-squared, etc.
summary(dfm)
#no such information
coeftest(dfm, vcov = NeweyWest)
btw.: same applies for vcovHC

coefficients is just a matrix in the lm (or dynlm) summary object, so all you need to do is unclass the coeftest() output.
library(dynlm)
library(sandwich)
library(lmtest)
temp.lm <- dynlm(runif(100) ~ rnorm(100))
temp.summ <- summary(temp.lm)
temp.summ$coefficients <- unclass(coeftest(temp.lm, vcov. = NeweyWest))

If you specify the covariance matrix, the F-statistic changes and you need to compute it again using waldtest(), right? Because
temp.summ$coefficients <- unclass(coeftest(temp.lm, vcov. = NeweyWest))
only overwrites the coefficients.
The F-statistic changes, but R^2 remains the same.
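A minimal sketch of the point made in this comment, assuming a plain lm() fit on the built-in cars data rather than the asker's dynlm model. The names summ, wt, robust_f and robust_p are made up for illustration. Overwriting the fstatistic element of the summary is one possible way to keep everything in a single printout, not an official interface.

```r
library(sandwich)  # NeweyWest()
library(lmtest)    # coeftest(), waldtest()

# Illustrative model on a built-in data set (not the asker's data).
fit  <- lm(dist ~ speed, data = cars)
summ <- summary(fit)

# Overwrite the coefficient table with Newey-West based inference.
summ$coefficients <- unclass(coeftest(fit, vcov. = NeweyWest))

# Recompute the overall F-test (model vs. intercept only) with the
# same Newey-West covariance estimator.
wt       <- waldtest(fit, vcov = NeweyWest)
robust_f <- wt$F[2]
robust_p <- wt$`Pr(>F)`[2]

# Assumption: replacing the stored F value makes print(summ) report the
# robust statistic; the degrees of freedom are left unchanged.
summ$fstatistic["value"] <- robust_f

# R-squared does not depend on the covariance estimator, so it is untouched.
print(summ)
```

With a single regressor, the robust F-statistic is simply the square of the robust t-statistic of that regressor, which is a quick sanity check.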

Related

Fixest and Standardized Coefficients from Easystats in R

I was wondering what would be the best way to calculate and present standardized coefficients using fixest. Here is what I tried using easystats
library(parameters)
library(effectsize)
library(fixest)
m <- lm(rating ~ complaints, data = attitude)
standardize_parameters(m, method="basic")# works
m <- feols(rating ~ complaints, data = attitude)
standardize_parameters(m, method="basic")# Error in stats::model.frame(model)[[1]] : subscript out of bounds
I also tried the modelsummary approach, but it shows unstandardized coefficients with no error.
library(modelsummary)
library(fixest)
library(parameters)
library(effectsize)
m <- lm(rating ~ complaints, data = attitude)
modelsummary(m, standardize="refit") # works, coeffs are different
m <- feols(rating ~ complaints, data = attitude)
modelsummary(m, standardize="refit")# doesn't work, coeffs are the same
Any insight or advice on how to elegantly and easily pull standardized coefficients out of fixest estimation results would be greatly appreciated. My goal is to replicate the venerable listcoef package in Stata. Many thanks to the authors of the packages mentioned in this post!
Edit:
> packageVersion("modelsummary")
[1] ‘1.1.0.9000’
One potential solution is to just manually calculate the standardized coefficients yourself, as [detailed here][1]. As an example, below I compute the standard deviations of your predictor and outcome, then use them to calculate the standardized beta coefficient of the only predictor in your model.
#### Scale Predictor and Outcome ####
scale.x <- sd(attitude$complaints)
scale.y <- sd(attitude$rating)
#### Obtain Standardized Coefficients ####
sb <- coef(m)[2] * scale.x / scale.y
sb
Which gives you this (you can ignore the column name, as it is just borrowing it from the original coef vector):
complaints
0.8254176
[1]: https://www.sciencedirect.com/topics/mathematics/standardized-regression-coefficient#:~:text=The%20standardized%20regression%20coefficient%2C%20found,one%20of%20its%20standardized%20units%20(
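The same rescaling extends to several predictors. Below is a rough sketch using lm() from base R so that it runs without extra packages. The second predictor (learning) and the object names X, sb and refit are chosen purely for illustration, and whether model.matrix() behaves the same way for a feols fit is an assumption I have not checked here.

```r
# Illustrative model with two predictors from the built-in attitude data.
m <- lm(rating ~ complaints + learning, data = attitude)

# Design matrix without the intercept column.
X <- model.matrix(m)[, -1, drop = FALSE]

# Standardized beta = raw coefficient * sd(predictor) / sd(outcome).
sb <- coef(m)[-1] * apply(X, 2, sd) / sd(attitude$rating)
sb

# Cross-check: refitting on z-scored variables gives the same slopes.
refit <- lm(scale(rating) ~ scale(complaints) + scale(learning),
            data = attitude)
coef(refit)[-1]
```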

Cluster-Robust Standard Errors in Stargazer

Does anyone know how to get stargazer to display clustered SEs for lm models? (And the corresponding F-test?) If possible, I'd like to follow an approach similar to computing heteroskedasticity-robust SEs with sandwich and popping them into stargazer as in http://jakeruss.com/cheatsheets/stargazer.html#robust-standard-errors-replicating-statas-robust-option.
I'm using lm to get my regression models, and I'm clustering by firm (a factor variable that I'm not including in the regression models). I also have a bunch of NA values, which makes me think multiwayvcov is going to be the best package (see the bottom of landroni's answer here - Double clustered standard errors for panel data - and also https://sites.google.com/site/npgraham1/research/code)? Note that I do not want to use plm.
Edit: I think I found a solution using the multiwayvcov package...
library(lmtest) # load packages
library(multiwayvcov)
library(stargazer)
data(petersen) # load data
petersen$z <- petersen$y + 0.35 # create new variable
ols1 <- lm(y ~ x, data = petersen) # create models
ols2 <- lm(y ~ x + z, data = petersen)
cl.cov1 <- cluster.vcov(ols1, petersen$firmid) # cluster-robust SEs for ols1
cl.robust.se.1 <- sqrt(diag(cl.cov1))
cl.wald1 <- waldtest(ols1, vcov = cl.cov1)
cl.cov2 <- cluster.vcov(ols2, petersen$firmid) # cluster-robust SEs for ols2 (petersen has no ticker column, so cluster by firmid)
cl.robust.se.2 <- sqrt(diag(cl.cov2))
cl.wald2 <- waldtest(ols2, vcov = cl.cov2)
stargazer(ols1, ols2, se=list(cl.robust.se.1, cl.robust.se.2), type = "text") # create table in stargazer
Only downside of this approach is you have to manually re-enter the F-stats from the waldtest() output for each model.
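One possible way around the manual re-entry, sketched under a few assumptions: the data below are simulated (dat, firm, x and y are made-up names), sandwich::vcovCL() stands in for cluster.vcov(), and the F-statistic is pushed into the table through stargazer's add.lines and omit.stat arguments. The stargazer call is guarded so the rest of the sketch runs even if the package is not installed.

```r
library(sandwich)  # vcovCL()
library(lmtest)    # waldtest()

# Simulated stand-in for firm-level panel data (hypothetical names).
set.seed(1)
dat   <- data.frame(firm = rep(1:50, each = 10), x = rnorm(500))
dat$y <- 1 + 0.5 * dat$x + rnorm(50)[dat$firm] + rnorm(500)

fit <- lm(y ~ x, data = dat)

# Cluster-robust covariance matrix, standard errors and Wald F-test.
cl_vcov <- vcovCL(fit, cluster = dat$firm)
cl_se   <- sqrt(diag(cl_vcov))
cl_wald <- waldtest(fit, vcov = cl_vcov)

# One extra table row: a label plus one formatted value per model.
f_line <- c("Cluster-robust F", sprintf("%.2f", cl_wald$F[2]))

if (requireNamespace("stargazer", quietly = TRUE)) {
  stargazer::stargazer(fit, se = list(cl_se),
                       omit.stat = "f",            # drop the non-robust F
                       add.lines = list(f_line),   # add the robust one
                       type = "text")
}
```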
Using the packages lmtest and multiwayvcov causes a lot of unnecessary overhead. The easiest way to compute clustered standard errors in R is a modified summary() function (a replacement for the standard one, provided in the post linked below). This function allows you to add an additional parameter, called cluster, to the conventional summary() function. The following post describes how to use this function to compute clustered standard errors in R:
https://economictheoryblog.com/2016/12/13/clustered-standard-errors-in-r/
You can easily use this summary function to obtain clustered standard errors and add them to the stargazer output. Based on your example you could simply use the following code:
# estimate models
ols1 <- lm(y ~ x)
# summary with cluster-robust SEs
summary(ols1, cluster="cluster_id")
# create table in stargazer
stargazer(ols1, se=list(coef(summary(ols1,cluster = c("cluster_id")))[, 2]), type = "text")
I would recommend the lfe package, whose felm() function is much more powerful than lm(). You can easily specify the cluster in the regression model:
library(lfe)
library(stargazer)
ols1 <- felm(y ~ x + z|0|0|firmid, data = petersen)
summary(ols1)
stargazer(ols1, type="html")
The clustered standard errors will be produced automatically, and stargazer will report them accordingly.
By the way (allow me to do more marketing), for micro-econometric analysis, felm is highly recommended. You can specify fixed effects and IV easily using felm. The syntax is:
ols1 <- felm(y ~ x + z|FixedEffect1 + FixedEffect2 | IV | Cluster, data = Data)

Stargazer: omit stars for constant only

Sometimes it's tacky to include statistical significance stars for the constant term when reporting the results of a regression. Is it possible to configure stargazer to keep stars for the regressors, but not for the constant term?
fit <- lm(rating ~ complaints, data=attitude)
stargazer(fit)
Basically, the answer turned out to be using stargazer's p argument. From there, I just needed to write a (series of) function(s) that took a list of regression fits and returned a list of vectors of p-values. I then manually changed the p-value of the intercepts to be 1, and presto, no tacky stars on the intercept. Plus it's reproducible with no manual LaTeX editing!
commarobust <- function(fit){
  require(sandwich)
  require(lmtest)
  coeftest(fit, vcovHC(fit, type="HC2"))
}
getrobustps <- function(fit){
  robustfit <- commarobust(fit)
  ps <- robustfit[,4]
  ps["(Intercept)"] <- 1
  return(ps)
}
makerobustpslist <- function(fitlist){
  return(lapply(fitlist, FUN=getrobustps))
}
Then in the stargazer call:
stargazer(fit_1, fit_2, fit_3, fit_4, fit_5,
p=makerobustpslist(list(fit_1, fit_2, fit_3, fit_4, fit_5)))
Works like a charm.
You could alternatively use the broom package to convert the fit results to a data frame, and then add stars to your heart's content:
library("broom")
mod <- lm(mpg ~ wt + qsec, data = mtcars)
DF <- tidy(mod)
DF$stars <- c("", "***", "***") # inspect and add manually, or automate
And the xtable package could be used to format it for LaTeX or whatever.
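The "or automate" part could look roughly like the sketch below. It builds the coefficient table with base R instead of broom so that it runs without extra packages; the column names mimic those of tidy(). The cut-offs (0.01, 0.05, 0.1) are an assumption chosen to mirror stargazer's defaults, and the intercept is blanked out to match the question.

```r
mod <- lm(mpg ~ wt + qsec, data = mtcars)

# Coefficient table as a data frame, with tidy()-style column names.
DF <- as.data.frame(coef(summary(mod)))
names(DF) <- c("estimate", "std.error", "statistic", "p.value")

# Map p-values to stars (assumed cut-offs: *** <= 0.01, ** <= 0.05, * <= 0.1).
DF$stars <- as.character(cut(DF$p.value,
                             breaks = c(0, 0.01, 0.05, 0.1, 1),
                             labels = c("***", "**", "*", ""),
                             include.lowest = TRUE))

# No stars for the constant term.
DF$stars[rownames(DF) == "(Intercept)"] <- ""
DF
```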

Using coef and summary.lm with robcov in R (extracting p-values)

I can extract the p-values for my slope & intercept from an ols object this way:
library(rms)
m1 <- ols(wt ~ cyl, data= mtcars, x= TRUE, y= TRUE)
coef(summary.lm(m1))
But when I try the same thing with a robcov object, summary.lm gives me the p-values from the original model (m1), not the robcov model:
m2 <- robcov(m1)
m2
coef(summary.lm(m2))
I think this must be related to the Warning from the robcov help page,
Warnings
Adjusted ols fits do not have the corrected standard errors printed
with print.ols. Use sqrt(diag(adjfit$var)) to get this, where adjfit
is the result of robcov.
but I'm not sure how.
Is there a way to extract the p-values from a robcov object? (I'm really only interested in the one for the slope, if that makes a difference...)
Hacking through print.ols and prModFit, I came up with this.
errordf <- m2$df.residual
beta <- m2$coefficients
se <- sqrt(diag(m2$var))
Z <- beta/se
P <- 2 * (1 - pt(abs(Z), errordf))
Replace m2 with any other robcov model as needed.
You can check the result yourself by comparing P with the output of print(m2).
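The calculation above can be wrapped in a small helper. This is only a sketch: p_from_vcov is a made-up name, and the check uses a plain lm() fit with its ordinary covariance matrix so that it runs without rms. For a robcov fit you would pass m2$coefficients, m2$var and m2$df.residual, as in the lines above.

```r
# Two-sided p-values from coefficients, a covariance matrix and residual df.
p_from_vcov <- function(beta, V, df) {
  se     <- sqrt(diag(V))   # standard errors from the diagonal of V
  t_stat <- beta / se       # t-statistics
  2 * pt(abs(t_stat), df = df, lower.tail = FALSE)
}

# Sanity check with an ordinary lm fit: using the usual vcov() should
# reproduce the p-values that summary() reports.
fit <- lm(wt ~ cyl, data = mtcars)
p   <- p_from_vcov(coef(fit), vcov(fit), df.residual(fit))
p
```

Using lower.tail = FALSE instead of 1 - pt(...) avoids losing precision for very small p-values.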

Sort xtable() output by p-value from glm model summary

I'm modelling a lot of data for different companies, and for each company I need to identify quickly those model parameters that are most significant. What I would like to see is xtable() output for a fitted model that sorts all coefficients in increasing order of p-value (ie, most significant parameters first).
x <- data.frame(a=rnorm(100), b=runif(100), c=rnorm(100), e=rnorm(100))
fit <- glm(a ~ ., data=x)
xtable(fit)
I'm guessing that I may be able to accomplish something like this by messing with the structure of the fit object. But I'm not familiar with the structure enough to be able to confidently change anything.
Suggestions?
Not necessarily the most elegant solution, but that should do the job:
data(birthwt, package="MASS")
glm.res <- glm(low ~ ., data=birthwt[,-10])
idx <- order(coef(summary(glm.res))[,4]) # sort out the p-values
out <- coef(summary(glm.res))[idx,] # reorder coef, SE, etc. by increasing p
library(xtable)
xtable(out)
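Since the question mentions many models, the same idea can be put into a small function. This is a sketch applied to the asker's simulated data; sort_by_p is a hypothetical name, the seed is added only for reproducibility, and the xtable call is guarded in case the package is not installed.

```r
# Return the coefficient table of a fitted model, sorted by p-value.
sort_by_p <- function(model) {
  tab <- coef(summary(model))             # estimate, SE, statistic, p-value
  tab[order(tab[, 4]), , drop = FALSE]    # column 4 holds the p-values
}

set.seed(42)
x   <- data.frame(a = rnorm(100), b = runif(100), c = rnorm(100), e = rnorm(100))
fit <- glm(a ~ ., data = x)
out <- sort_by_p(fit)

if (requireNamespace("xtable", quietly = TRUE)) {
  print(xtable::xtable(out))
}
```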
