Does anyone know how to get stargazer to display clustered SEs for lm models? (And the corresponding F-test?) If possible, I'd like to follow an approach similar to computing heteroskedasticity-robust SEs with sandwich and popping them into stargazer as in http://jakeruss.com/cheatsheets/stargazer.html#robust-standard-errors-replicating-statas-robust-option.
I'm using lm to get my regression models, and I'm clustering by firm (a factor variable that I'm not including in the regression models). I also have a bunch of NA values, which makes me think multiwayvcov is going to be the best package (see the bottom of landroni's answer here - Double clustered standard errors for panel data - and also https://sites.google.com/site/npgraham1/research/code)? Note that I do not want to use plm.
Edit: I think I found a solution using the multiwayvcov package...
library(lmtest) # load packages
library(multiwayvcov)
library(stargazer)
data(petersen) # load data
petersen$z <- petersen$y + 0.35 # create new variable
ols1 <- lm(y ~ x, data = petersen) # create models
ols2 <- lm(y ~ x + z, data = petersen)
cl.cov1 <- cluster.vcov(ols1, petersen$firmid) # cluster-robust SEs for ols1
cl.robust.se.1 <- sqrt(diag(cl.cov1))
cl.wald1 <- waldtest(ols1, vcov = cl.cov1)
cl.cov2 <- cluster.vcov(ols2, petersen$firmid) # cluster-robust SEs for ols2
cl.robust.se.2 <- sqrt(diag(cl.cov2))
cl.wald2 <- waldtest(ols2, vcov = cl.cov2)
stargazer(ols1, ols2, se=list(cl.robust.se.1, cl.robust.se.2), type = "text") # create table in stargazer
The only downside of this approach is that you have to manually re-enter the F-stats from the waldtest() output for each model.
Using the packages lmtest and multiwayvcov causes a lot of unnecessary overhead. The easiest way to compute clustered standard errors in R is with a modified summary() function, which accepts an additional parameter, cluster, on top of the arguments of the conventional summary() function. The following post describes how to use this function to compute clustered standard errors in R:
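One way to avoid retyping the F-statistic is to pull it out of the waldtest() result and hand it to stargazer through add.lines. The following is a minimal sketch, assuming the petersen data from multiwayvcov and clustering on firmid; the row label "Cluster-robust F" and the object name f1 are made up for illustration.

```r
library(lmtest)
library(multiwayvcov)
library(stargazer)

data(petersen)
ols1 <- lm(y ~ x, data = petersen)

# cluster-robust covariance matrix and the matching Wald (F) test
cl.cov1  <- cluster.vcov(ols1, petersen$firmid)
cl.wald1 <- waldtest(ols1, vcov = cl.cov1)

# waldtest() returns an anova-style table; the statistic sits in row 2 of column F
f1 <- cl.wald1$F[2]

stargazer(ols1,
          se = list(sqrt(diag(cl.cov1))),
          omit.stat = "f",  # drop the conventional F-statistic
          add.lines = list(c("Cluster-robust F", sprintf("%.2f", f1))),
          type = "text")
```

With several models, add one entry per model to the character vector in add.lines, in the same order as the models.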
https://economictheoryblog.com/2016/12/13/clustered-standard-errors-in-r/
You can easily use the summary function to obtain clustered standard errors and add them to the stargazer output. Based on your example, you could simply use the following code:
# estimate models
ols1 <- lm(y ~ x)
# summary with cluster-robust SEs
summary(ols1, cluster="cluster_id")
# create table in stargazer
stargazer(ols1, se=list(coef(summary(ols1,cluster = c("cluster_id")))[, 2]), type = "text")
I would recommend the lfe package, whose felm() function is much more powerful than lm(). You can easily specify the cluster in the regression model:
library(lfe) # provides felm()
ols1 <- felm(y ~ x + z|0|0|firmid, data = petersen)
summary(ols1)
stargazer(ols1, type="html")
The clustered standard errors are produced automatically, and stargazer reports them accordingly.
By the way (allow me to do more marketing), felm is highly recommended for micro-econometric analysis. You can easily specify fixed effects and IVs with felm. The syntax is:
ols1 <- felm(y ~ x + z|FixedEffect1 + FixedEffect2 | IV | Cluster, data = Data)
Related
I'm trying to model raw data by an asymptotic function with the equation $$f(x) = a + (b-a)(1-\exp(-c x))$$ using R. To do so I used the following code:
rawData <- import("path/StackTestData.tsv")
# executing regression
X <- rawData$x
Y <- rawData$y
model <- drm(Y ~ X, fct = DRC.asymReg())
# creating the regression function
f_0_ <- model$coefficients[1] #value for y if x=0
steepness <- model$coefficients[2]
plateau <- model$coefficients[3]
eq <- function(x){f_0_+(plateau-f_0_)*(1-exp(-steepness*x))}
# plotting the regression function together with the raw data
ggplot(rawData,aes(x=x,y=y)) +
geom_line(col="red") +
stat_function(fun=eq,col="blue") +
ylim(10,12.5)
In some cases, I got a proper regression function. However, with the attached data I don't get one. The regression function is not showing any correlation with the raw data whatsoever, as shown in the figure below. Can you perhaps offer a better solution for performing the asymptotic regression or do you know where the error lies?
Best, Max
R 4.1.2 was used with RStudio 1.4.1106. For ggplot, the package ggpubr was loaded; for DRC.asymReg(), the packages aomisc and drc were loaded.
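For comparison, the same curve can be fitted with base R alone. This is only a sketch on simulated data (the attached data is not available here), using nls() with the self-starting model SSasymp(), which is parameterised as Asym + (R0 - Asym) * exp(-exp(lrc) * x). That matches the equation above with a = R0, b = Asym and c = exp(lrc). The true values 10, 12 and 0.8 are invented for the simulation.

```r
set.seed(1)

# simulated data following f(x) = a + (b - a) * (1 - exp(-c * x))
a_true <- 10; b_true <- 12; c_true <- 0.8
d <- data.frame(x = seq(0, 10, length.out = 50))
d$y <- a_true + (b_true - a_true) * (1 - exp(-c_true * d$x)) +
  rnorm(nrow(d), sd = 0.05)

# self-starting asymptotic regression: no manual start values needed
fit <- nls(y ~ SSasymp(x, Asym, R0, lrc), data = d)
cf  <- coef(fit)

# map the SSasymp parameters back to a, b, c; extract by NAME, not by position
est <- c(a = unname(cf["R0"]),
         b = unname(cf["Asym"]),
         c = unname(exp(cf["lrc"])))
print(est)

# fitted curve as a plain function, usable in stat_function()
eq <- function(x) est["a"] + (est["b"] - est["a"]) * (1 - exp(-est["c"] * x))
```

Extracting coefficients by name rather than by index guards against a fitted curve that looks unrelated to the data simply because the parameters were picked up in the wrong order.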
I would like to create a table like the ones produced by the stargazer package.
However, I am building gravity models with the gravity package, which stargazer does not support yet.
Do you have an idea how to create a similar table with 3-5 models side by side for easier comparison?
The output should look like this, just with gravity models from the gravity package in R:
Desired Output Style:
Please provide an example of a model object created by the gravity package.
Alternatively,
I will show one approach that can be used: stargazer is really nice, and you CAN create a table like the one above even with model objects that are not yet supported. For example, let's pretend that quantile regression models are not supported by stargazer (even though they are):
The trick is that you need to be able to obtain the coefficients and standard errors, e.g. as vectors. Then supply stargazer with a supported model object (e.g. an lm fit) as a template and manually specify which coefficients and standard errors should be used:
library(stargazer)
library(tidyverse)
library(quantreg)
df <- mtcars
model1 <- lm(hp ~ factor(gear) + qsec + disp, data = df)
quantreg <- rq(hp ~ factor(gear) + qsec + disp, data = df)
summary_qr <- summary(quantreg, se = "boot")
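# Standard errors for the quantile regression, entered by hand
# (summary_qr$coefficients[, 2] returns them directly)
se_qr = c(211.78266, 29.17307, 58.61105, 9.70908, 0.12090)
stargazer(model1, model1,
coef = list(NULL, summary_qr$coefficients[, 1]), # first column holds the estimates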
se = list(NULL, se_qr),
type = "text")
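The hand-typed vector can be avoided by reading estimates, standard errors and p-values straight from the summary table. A minimal sketch of that variant follows; the names qr_fit, qr_tab and coef_qr are introduced here for illustration, and because the standard errors are bootstrapped they change with the seed.

```r
library(stargazer)
library(quantreg)

set.seed(123)  # bootstrap SEs depend on the seed
df <- mtcars

model1 <- lm(hp ~ factor(gear) + qsec + disp, data = df)   # template model
qr_fit <- rq(hp ~ factor(gear) + qsec + disp, data = df)   # median regression

# with se = "boot" the table columns are: estimate, std. error, t value, p-value
qr_tab  <- summary(qr_fit, se = "boot")$coefficients
coef_qr <- qr_tab[, 1]
se_qr   <- qr_tab[, 2]
p_qr    <- qr_tab[, 4]

# stargazer matches by name, so the names must agree with the template
stopifnot(identical(names(coef_qr), names(coef(model1))))

stargazer(model1, model1,
          coef = list(NULL, coef_qr),
          se   = list(NULL, se_qr),
          p    = list(NULL, p_qr),
          column.labels = c("OLS", "Median reg."),
          type = "text")
```

Note that the summary rows at the bottom of the table (R-squared, F-statistic and so on) still come from the lm template, so for the second column they should be dropped with omit.stat or replaced via add.lines.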
I want to estimate a fixed effects model while using panel-corrected standard errors as well as Prais-Winsten (AR1) transformation in order to solve panel heteroscedasticity, contemporaneous spatial correlation and autocorrelation.
I have time-series cross-section data and want to perform regression analysis. I was able to estimate a fixed effects model, panel-corrected standard errors and Prais-Winsten estimates individually, and I was able to include panel-corrected standard errors in a fixed effects model. But I want all three at once.
# Basic ols model
ols1 <- lm(y ~ x1 + x2, data = data)
summary(ols1)
# Fixed effects model
library('plm')
plm1 <- plm(y ~ x1 + x2, data = data, model = 'within')
summary(plm1)
# Panel Corrected Standard Errors
library(pcse)
lm.pcse1 <- pcse(ols1, groupN = Country, groupT = Time)
summary(lm.pcse1)
# Prais-Winsten estimates
library(prais)
prais1 <- prais_winsten(y ~ x1 + x2, data = data)
summary(prais1)
# Combination of Fixed effects and Panel Corrected Standard Errors
ols.fe <- lm(y ~ x1 + x2 + factor(Country) - 1, data = data)
pcse.fe <- pcse(ols.fe, groupN = Country, groupT = Time)
summary(pcse.fe)
With the Stata command xtpcse it is possible to combine panel-corrected standard errors with Prais-Winsten estimates, using something along the lines of the following code:
xtpcse y x x x i.cc, c(ar1)
I would like to achieve this in R as well.
I am not sure that my answer will completely address your concern, but I have been trying to deal with the same problem lately.
In my case, I ran the Prais-Winsten function from the package prais on my model including the fixed effects. Afterwards, I corrected for heteroskedasticity using the function vcovHC.prais, which is analogous to the vcovHC function from the package sandwich.
This gives you White's (sandwich) heteroskedasticity-consistent covariance matrix. If you then pass it to the function coeftest from the package lmtest, you get the coefficient table with the corrected standard errors. Taking your posted example, see below the code that I used:
# Prais-Winsten estimates with Fixed Effects
library(prais)
prais.fe <- prais_winsten(y ~ x1 + x2 + factor(Country), data = data)
library(lmtest)
prais.fe.w <- coeftest(prais.fe, vcov = vcovHC.prais(prais.fe, "HC1"))
prais.fe.w # print the object to see the output with the corrected standard errors.
Alas, I am aware that the sandwich heteroskedasticity-consistent standard errors are not exactly the same as Beck and Katz's PCSEs, because PCSEs deal with panel heteroskedasticity while sandwich SEs address overall heteroskedasticity. I am not sure how much the two differ in practice, but it is a start.
I hope my answer was somehow helpful, this is actually my very first answer :D
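To make the Prais-Winsten step less of a black box, here is a base R sketch of the transformation with a single common AR(1) coefficient, applied within each country to a fixed effects model. It runs on a simulated balanced panel (all data and the helper pw() are invented for illustration), it does a single pass rather than iterating rho to convergence, and it is not claimed to reproduce xtpcse output. The transformed fit is an ordinary lm object, so a panel-corrected covariance could presumably be computed on it afterwards, but that part is untested here.

```r
set.seed(42)

# simulated balanced panel, sorted by Country and then Time
n_c <- 5; n_t <- 20
panel <- data.frame(Country = rep(paste0("C", 1:n_c), each = n_t),
                    Time    = rep(1:n_t, times = n_c),
                    x1      = rnorm(n_c * n_t),
                    x2      = rnorm(n_c * n_t))
# AR(1) errors generated separately for every country
e <- unlist(lapply(1:n_c, function(i)
  as.numeric(arima.sim(list(ar = 0.6), n = n_t))))
panel$y <- 1 + 2 * panel$x1 - panel$x2 + e

# Step 1: fixed effects OLS, then a common rho from the lagged residuals
ols.fe <- lm(y ~ x1 + x2 + factor(Country) - 1, data = panel)
u      <- resid(ols.fe)
lag_u  <- ave(u, panel$Country, FUN = function(v) c(NA, v[-length(v)]))
ok     <- !is.na(lag_u)
rho    <- sum(u[ok] * lag_u[ok]) / sum(lag_u[ok]^2)

# Step 2: Prais-Winsten transform within each country; the first observation
# is rescaled by sqrt(1 - rho^2) instead of being dropped
pw <- function(v, g, rho) {
  ave(v, g, FUN = function(z)
    c(sqrt(1 - rho^2) * z[1], z[-1] - rho * z[-length(z)]))
}
X      <- model.matrix(ols.fe)
X_star <- apply(X, 2, pw, g = panel$Country, rho = rho)
y_star <- pw(panel$y, panel$Country, rho)

# Step 3: OLS on the transformed data
fit_pw <- lm(y_star ~ X_star - 1)
print(rho)
print(coef(fit_pw))
```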
I am running multiple linear regression models and I would like to loop through the results to subsequently generate robust standard errors. My code currently looks like this, but I want to run multiple models and not have to copy the code for calculating robust standard errors for each model.
# load packages and data
library(sandwich) # provides vcovHC()
data(mtcars)
# run models
m1 <- lm("mpg ~ wt", data = mtcars)
m2 <- lm("mpg ~ wt + hp", data = mtcars)
# calculate robust standard errors
cov1 <- vcovHC(m1, type = "HC3")
robust_se1 <- sqrt(diag(cov1))
cov2 <- vcovHC(m2, type = "HC3")
robust_se2 <- sqrt(diag(cov2))
How could I write a function to handle this task? I plan to number each model using successive integers, e.g., m1, m2, m3. So far I have not been able to adapt related SO answers for generating variables using a loop.
Edits: changed to executable code.
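One common pattern is to keep the models in a named list instead of separately numbered variables and to apply a small helper over that list. A minimal sketch, assuming HC3 standard errors as in the code above; the helper name robust_se_of is made up for illustration.

```r
library(sandwich)

data(mtcars)

# keep all models in one named list rather than in m1, m2, m3, ...
models <- list(
  m1 = lm(mpg ~ wt, data = mtcars),
  m2 = lm(mpg ~ wt + hp, data = mtcars)
)

# helper: robust standard errors for a single model
robust_se_of <- function(model, type = "HC3") {
  sqrt(diag(vcovHC(model, type = type)))
}

# one vector of robust SEs per model, with the same names as the list
robust_se <- lapply(models, robust_se_of)
print(robust_se$m2)
```

Adding a model then only means adding one element to the list. The resulting list also has the shape that functions expecting one SE vector per model, such as the se argument of stargazer, typically take.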
I am using Newey-West standard errors to correct my lm() / dynlm() output. E.g.:
fit1<-dynlm(depvar~covariate1+covariate2)
coeftest(fit1,vcov=NeweyWest)
The coefficients are displayed the way I'd like, but unfortunately I lose all the regression output information, such as R-squared and the F-test, that summary displays. So I wonder how I can display robust SEs and all the other information in the same summary output.
Is there a way to either get everything in one call or to overwrite the 'old' estimates?
I bet I am missing something obvious, but this really matters when Sweaving the output.
Test example, taken from ?dynlm.
require(dynlm)
require(sandwich)
data("UKDriverDeaths", package = "datasets")
uk <- log10(UKDriverDeaths)
dfm <- dynlm(uk ~ L(uk, 1) + L(uk, 12))
#shows R-squared, etc.
summary(dfm)
#no such information
coeftest(dfm, vcov = NeweyWest)
By the way, the same applies to vcovHC.
coefficients is just a matrix in the lm (or dynlm) summary object, so all you need to do is unclass the coeftest() output.
library(dynlm)
library(sandwich)
library(lmtest)
temp.lm <- dynlm(runif(100) ~ rnorm(100))
temp.summ <- summary(temp.lm)
temp.summ$coefficients <- unclass(coeftest(temp.lm, vcov. = NeweyWest))
If you specify the covariance matrix, the F-statistic changes and you need to compute it again using waldtest(), right? Because
temp.summ$coefficients <- unclass(coeftest(temp.lm, vcov. = NeweyWest))
only overwrites the coefficients.
The F-statistic changes, but R^2 remains the same.
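The F-statistic stored in the summary object can be overwritten in the same way as the coefficient table. The sketch below uses plain lm() on a hand-built lagged data frame rather than dynlm(), an assumption made so that waldtest() compares models fitted to the same sample; the names d, fit, summ and wt are introduced for illustration.

```r
library(sandwich)
library(lmtest)

# rebuild the ?dynlm example with explicit lags, so lm() can be used
uk <- as.numeric(log10(UKDriverDeaths))
n  <- length(uk)
d  <- data.frame(y     = uk[13:n],
                 lag1  = uk[12:(n - 1)],
                 lag12 = uk[1:(n - 12)])
fit  <- lm(y ~ lag1 + lag12, data = d)
summ <- summary(fit)

# overwrite the coefficient table with Newey-West based inference
summ$coefficients <- unclass(coeftest(fit, vcov. = NeweyWest))

# robust Wald test against the intercept-only model; statistic is in row 2
wt <- waldtest(fit, vcov = NeweyWest)
summ$fstatistic["value"] <- wt$F[2]

# prints robust SEs and robust F; R-squared is unchanged
print(summ)
```

The degrees of freedom kept in summ$fstatistic are untouched, so the p-value printed next to the F-statistic is computed from the robust statistic with the original numerator and denominator degrees of freedom.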