I would like to display the t value and p value from my regression output using the stargazer package. So far, I've found ways to show one or the other. I did find something online that displays both; however, it omits the coefficient names. Is there a function or something else that will enable me to show all three?
library(stargazer)
data("mtcars")
model <- lm(hp ~ wt, mtcars)

stargazer(model, type = 'text', report = "csp")
# shows p-values, but no coefficient names

stargazer(model, type = 'text')
# the default report ("vc*s") shows coefficient names, but no p-values

I would like to keep all three: coefficient names, t statistic, and p-value.
You can get the test statistic by including "t" in the report argument. From the stargazer documentation:
a character string containing only elements of "v", "c", "s", "t", "p",
"*" that determines whether, and in which order, variable names ("v"),
coefficients ("c"), standard errors/confidence intervals ("s"), test
statistics ("t") and p-values ("p") should be reported in regression
tables. If one of the aforementioned letters is followed by an
asterisk ("*"), significance stars will be reported next to the
corresponding statistic.
Here is a reproducible example:
library(stargazer)
data("mtcars")
model <- lm(hp ~ wt, mtcars)

stargazer(model, type = 'text', report = "vcstp")
#> 
#> ===============================================
#>                         Dependent variable:    
#>                     ---------------------------
#>                                  hp            
#> -----------------------------------------------
#> wt                             46.160
#>                                (9.625)
#>                               t = 4.796
#>                              p = 0.00005
#> 
#> Constant                       -1.821
#>                               (32.325)
#>                              t = -0.056
#>                               p = 0.956
#> 
#> -----------------------------------------------
#> Observations                     32
#> R2                              0.434
#> Adjusted R2                     0.415
#> Residual Std. Error       52.437 (df = 30)
#> F Statistic             22.999*** (df = 1; 30)
#> ===============================================
#> Note:               *p<0.1; **p<0.05; ***p<0.01
Created on 2022-11-12 with reprex v2.0.2
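If you also want significance stars printed next to a statistic, the asterisk described in the documentation works the same way here; for example, to star the coefficients:

# Same table, with significance stars attached to the coefficients
stargazer(model, type = 'text', report = "vc*stp")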
I want to create a GLS regression table that shows the R-squared and the number of observations where "Log Likelihood" etc. currently appear. The p-values should be below the coefficients in the table. Here is an example of the code:
# import the necessary packages
library(nlme)
library(dplyr)
library(stargazer)

# keep only observations with a value in the "Price.Book.Value" column
dotcom_subset_MBV <- dotcom_subset %>% filter(!is.na(Price.Book.Value))
financial_subset_MBV <- financial_subset %>% filter(!is.na(Price.Book.Value))
covid_subset_MBV <- covid_subset %>% filter(!is.na(Price.Book.Value))

# Hypothesis 2: fit GLS models
dotcom_model_MBV <- gls(X1.Month.Equity.Premium ~ crisis * Price.Book.Value,
                        data = dotcom_subset_MBV, method = "ML")
financial_model_MBV <- gls(X1.Month.Equity.Premium ~ crisis * Price.Book.Value,
                           data = financial_subset_MBV, method = "ML")
covid_model_MBV <- gls(X1.Month.Equity.Premium ~ crisis * Price.Book.Value,
                       data = covid_subset_MBV, method = "ML")

stargazer(dotcom_model_MBV, financial_model_MBV, covid_model_MBV,
          type = "text",
          column.labels = c("Dotcom", "Financial", "Covid"),
          report = "vc*p")
The only problem with the code above is that it shows the Log Likelihood, Akaike Inf. Crit. and Bayesian Inf. Crit. instead of the R-squared values. The rest is okay.
I tried the following:
omit.stat = c("ll", "AIC", "BIC")
and it works. However, it still doesn't show the R-squared. Then I tried:

add.lines = list(c(paste0("R-squared = ", round(r2_dotcom, 2))))

and it includes a line that is called "R-squared" but without any values.
Here's a working example using the mtcars data:
data(mtcars)
library(nlme)
library(stargazer)
#> 
#> Please cite as: 
#>  Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
#>  R package version 5.2.3. https://CRAN.R-project.org/package=stargazer

m1 <- gls(qsec ~ cyl + wt, data = mtcars)
m2 <- gls(mpg ~ cyl + wt, data = mtcars)

# R-squared as the squared correlation between fitted and observed values
r2 <- c(cor(fitted(m1), mtcars$qsec)^2,
        cor(fitted(m2), mtcars$mpg)^2)

stargazer(m1, m2,
          type = "text",
          omit.stat = c("ll", "AIC", "BIC"),
          add.lines = list(c("R-squared", sprintf("%.2f", r2))))
#> 
#> =========================================
#>                  Dependent variable:     
#>              ----------------------------
#>                  qsec            mpg
#>                   (1)            (2)
#> -----------------------------------------
#> cyl            -1.173***      -1.508***
#>                 (0.197)        (0.415)
#> 
#> wt              1.356***      -3.191***
#>                 (0.360)        (0.757)
#> 
#> Constant       20.743***      39.686***
#>                 (0.815)        (1.715)
#> 
#> -----------------------------------------
#> R-squared        0.56           0.83
#> Observations      32             32
#> =========================================
#> Note:     *p<0.1; **p<0.05; ***p<0.01
Created on 2023-02-01 by the reprex package (v2.0.1)
The add.lines option expects a list, and each vector in the list is appended to the table as a row, with one element per column. So you need a vector whose first element is the label "R-squared", followed by each R-squared value formatted as a string (that's what sprintf() does).
Also note that I've calculated the R-squared as the squared correlation between the observed and fitted values, but I make no claim that this is statistically sound for GLS (though it is one way of calculating R-squared in the OLS model).
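In fact, for a plain lm() fit the squared correlation between fitted and observed values reproduces summary.lm()'s R-squared exactly; here is a quick sketch checking that on the same mtcars variables:

# For OLS, the squared correlation between fitted and observed values
# equals the R-squared reported by summary.lm()
ols <- lm(qsec ~ cyl + wt, data = mtcars)
cor(fitted(ols), mtcars$qsec)^2  # squared correlation
summary(ols)$r.squared           # same value for OLS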
I have fitted a binomial logistic GLM with a three-way interaction between sex (male & female), tree cover including a quadratic term (1-100%), and the mean tree cover of an area (1-100%).
(case is 1 for used and 0 for not used)
glm.winter.3 <- glm(case ~ sex. * mean95 * poly(tree.cover, 2),
                    data = rsf.winter.3, family = binomial(link = "logit"))
I found a nice plot in a paper (a 3D surface of predicted values; the image is not reproduced here). I would like to do something similar, but I cannot find a way to approach it. My data set is large, so it's hard to share. Maybe somebody has an idea how to approach it anyway? Thanks
Here's an example with some built-in data. Using the mtcars data, I made a binary variable gg identifying gas guzzlers, which is 1 for cars that get less than 19 MPG and 0 for all others. I then modelled it as a multiplicative function of wt (weight), hp (horsepower) and vs (whether or not the engine is V-shaped). Here are the results:
library(lattice)
library(dplyr)

data(mtcars)
mtcars <- mtcars %>%
  mutate(fast = as.numeric(qsec < mean(qsec)),
         gg = as.numeric(mpg < 19),
         carb2 = ifelse(carb <= 2, 0, 1))

mod <- glm(gg ~ hp * wt * vs, data = mtcars, family = binomial)
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
summary(mod)
#> 
#> Call:
#> glm(formula = gg ~ hp * wt * vs, family = binomial, data = mtcars)
#> 
#> Deviance Residuals: 
#>      Min        1Q    Median        3Q       Max  
#> -2.24867  -0.00002   0.00000   0.28763   1.17741  
#> 
#> Coefficients:
#>               Estimate Std. Error z value Pr(>|z|)
#> (Intercept) -3.299e+01  4.336e+01  -0.761    0.447
#> hp           1.346e-01  2.341e-01   0.575    0.565
#> wt           8.744e+00  1.271e+01   0.688    0.491
#> vs          -1.447e+03  5.600e+05  -0.003    0.998
#> hp:wt       -3.231e-02  6.798e-02  -0.475    0.635
#> hp:vs        9.144e+00  6.074e+03   0.002    0.999
#> wt:vs        4.512e+02  1.701e+05   0.003    0.998
#> hp:wt:vs    -2.906e+00  1.815e+03  -0.002    0.999
#> 
#> (Dispersion parameter for binomial family taken to be 1)
#> 
#>     Null deviance: 44.236  on 31  degrees of freedom
#> Residual deviance: 11.506  on 24  degrees of freedom
#> AIC: 27.506
#> 
#> Number of Fisher Scoring iterations: 22
The next step is to make sequences of the two continuous variables that run from their minima to their maxima over some number of values; usually 25 is enough to give a smooth-looking graph.
hp_seq <- seq(min(mtcars$hp, na.rm = TRUE),
              max(mtcars$hp, na.rm = TRUE),
              length = 25)
wt_seq <- seq(min(mtcars$wt, na.rm = TRUE),
              max(mtcars$wt, na.rm = TRUE),
              length = 25)
Next, we make all combinations of the two sequences of values and the binary variable (in this case vs).
eg <- expand.grid(hp = hp_seq,
                  wt = wt_seq,
                  vs = c(0, 1))
We can then make predictions from those data and save them back into the data object. We also make vs a factor - the levels of the factor will show up in the strip above each plot.
eg$fit <- predict(mod, newdata=eg, type="response")
eg$vs <- factor(eg$vs, labels=c("VS = 0", "VS = 1"))
We use wireframe() from the lattice package to make the plots:
wireframe(fit ~ hp + wt | vs, data = eg, drape = TRUE,
          default.scales = list(arrows = FALSE))
Created on 2022-05-06 by the reprex package (v2.0.1)
Note, there is another answer about making 3D surface plots here that demonstrates persp from base R, plot_ly from the plotly package, and persp3D from the plot3D package.
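For instance, here is a minimal sketch of the same surface with base R's persp(), reusing the eg prediction grid built above; the matrix reshaping assumes expand.grid's ordering (hp varying fastest):

# persp() wants the fitted values as a matrix: rows index hp_seq,
# columns index wt_seq (expand.grid varies hp fastest, so this works)
fit_vs0 <- matrix(eg$fit[eg$vs == "VS = 0"], nrow = 25, ncol = 25)
persp(hp_seq, wt_seq, fit_vs0,
      xlab = "hp", ylab = "wt", zlab = "fit",
      theta = 45, phi = 25, ticktype = "detailed")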
I'd like to put an lm model with HC3-corrected standard errors and a fixest model with cluster-robust standard errors in the same table. See my MRE below:
library(modelsummary)
library(fixest)     # for feols()
library(sandwich)   # for vcovHC()

df <- mtcars
models <- list()
models[[1]] <- lm(cyl ~ disp, data = df)
models[[2]] <- feols(cyl ~ disp | as.factor(gear), data = df)

# this works
modelsummary::modelsummary(models)

# but these do not
modelsummary::modelsummary(models, vcov = c("HC3", "cluster"))
modelsummary::modelsummary(models, vcov = c(HC3, cluster))
modelsummary::modelsummary(models, vcov = list(HC3, cluster))
modelsummary::modelsummary(models, vcov = list(vcovHC, cluster))
modelsummary::modelsummary(models, vcov = list(vcovHC, vcovHC))
modelsummary::modelsummary(models, vcov = c(vcovHC, vcovHC))
Okay, I figured out a hack, but I'm still leaving the question open in case someone finds a more slick solution.
library(lmtest)     # for coeftest()
library(sandwich)   # for vcovHC()
library(fixest)
library(modelsummary)

df <- mtcars
models <- list()
fit <- lm(cyl ~ disp, data = df)
models[[1]] <- coeftest(fit, vcovHC(fit, type = "HC3"))
models[[2]] <- summary(feols(cyl ~ disp | as.factor(gear), data = df), "cluster")

# this works
modelsummary::modelsummary(models)
By default fixest automatically computes cluster-robust standard errors when you include fixed effects. If you set vcov=NULL, modelsummary will return the default standard errors, which will then be cluster-robust.
Alternatively, you can set vcov=~gear to compute the standard errors using the sandwich::vcovCL function under the hood.
Using version 0.10.0 of modelsummary and 0.10.4 of fixest, we can do:
library(modelsummary)
library(fixest)

models <- list(
  lm(cyl ~ disp, data = mtcars),
  feols(cyl ~ disp | gear, data = mtcars),
  feols(cyl ~ disp | gear, data = mtcars))

modelsummary(models,
             fmt = 6,
             vcov = list("HC3", NULL, ~gear))
|             | Model 1    | Model 2    | Model 3    |
|-------------|------------|------------|------------|
| (Intercept) | 3.188568   |            |            |
|             | (0.289130) |            |            |
| disp        | 0.012998   | 0.012061   | 0.012061   |
|             | (0.001310) | (0.002803) | (0.002803) |
| Num.Obs.    | 32         | 32         | 32         |
| R2          | 0.814      | 0.819      | 0.819      |
| R2 Adj.     | 0.807      | 0.800      | 0.800      |
| R2 Within   |            | 0.614      | 0.614      |
| R2 Pseudo   |            |            |            |
| AIC         | 79.1       | 80.2       | 80.2       |
| BIC         | 83.5       | 86.1       | 86.1       |
| Log.Lik.    | -36.573    | -36.107    | -36.107    |
| F           | 98.503     |            |            |
| Std.Errors  | HC3        | by: gear   | by: gear   |
| FE: gear    |            | X          | X          |
Notice that the results are the same in the 2nd and 3rd column, and also equal to the plain fixest summary:
summary(models[[2]])
#> OLS estimation, Dep. Var.: cyl
#> Observations: 32
#> Fixed-effects: gear: 3
#> Standard-errors: Clustered (gear)
#>      Estimate Std. Error t value Pr(>|t|)
#> disp 0.012061   0.002803 4.30299 0.049993 *
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 0.747808     Adj. R2: 0.799623
#>                  Within R2: 0.614334
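As a quick cross-check, the HC3 column for the lm model should likewise match what lmtest/sandwich report directly; a minimal sketch:

library(lmtest)
library(sandwich)

# The HC3 standard error for disp should match modelsummary's first column
coeftest(models[[1]], vcov. = vcovHC(models[[1]], type = "HC3"))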
In some past versions of modelsummary there were issues with labelling of the standard errors at the bottom of the table. I will soon be working on a more robust system to make sure the label matches the standard errors. In most cases, modelsummary calculates the robust standard errors itself, so it has full control over labelling and computation. Packages like fixest and estimatr make things a bit more tricky because they sometimes hold several SEs, and because the default is not always “classical”.
Forgive me if this is silly or easy, as I'm brand new to R, but I've been looking for hours to no avail.
I have a series of GLM models, and I'd like to report the standardized/reparametrized coefficients for each alongside the direct coefficients in a Stargazer table. I created two separate models, one with standardized coefficients using the arm package.
require(arm)
model1 <- glm(...)
model1.2 <- standardize(model1)
Both models work fine and give the outputs I want, but I can't seem to figure out a way to get stargazer to emulate this structure/look:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.65.699&rep=rep1&type=pdf
This can be done by asking stargazer to produce output for both models and making sure that the coefficients in the models have the same names so that stargazer knows that the standardized coefficient and the un-standardized coefficient should go on the same row.
The code below should help you get started.
# generate fake data
x <- runif(100)
y <- rbinom(100, 1, x)

# fit the model
m1 <- glm(y ~ x, family = binomial())
# standardize it
m2 <- arm::standardize(m1)

# make sure the coefficients have the same names in both models, so that
# stargazer puts the standardized and un-standardized coefficient on the same row
names(m2$coefficients) <- names(m1$coefficients)

# feed to stargazer
stargazer::stargazer(m1, m2, type = "text",
                     column.labels = c("coef (s.e.)", "standardized coef (s.e.)"),
                     single.row = TRUE)
#> 
#> ===========================================================
#>                             Dependent variable:            
#>                   -----------------------------------------
#>                                      y
#>                      coef (s.e.)   standardized coef (s.e.)
#>                          (1)                 (2)
#> -----------------------------------------------------------
#> x                 4.916*** (0.987)     2.947*** (0.592)
#> Constant         -2.123*** (0.506)      0.248 (0.247)
#> -----------------------------------------------------------
#> Observations            100                 100
#> Log Likelihood        -50.851             -50.851
#> Akaike Inf. Crit.     105.703             105.703
#> ===========================================================
#> Note:                           *p<0.1; **p<0.05; ***p<0.01
Created on 2019-02-13 by the reprex package (v0.2.1)
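As an aside on what standardize() does: arm follows Gelman's convention of centering numeric inputs and dividing by two standard deviations, so the standardized slope should be roughly the raw slope times 2 * sd(x). A quick check with the objects above (a sketch, assuming the default rescaling):

# arm::standardize() rescales numeric inputs by two standard deviations,
# so the two slopes should differ by roughly a factor of 2 * sd(x)
unname(coef(m1)["x"]) * 2 * sd(x)  # approximately the standardized slope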
You can usually figure out how to achieve what you want by diving into the stargazer help file and looking at this helpful webpage: https://www.jakeruss.com/cheatsheets/stargazer/
I have a glm model for which I use coeftest from the lmtest package to estimate robust standard errors. When I use stargazer to produce regression tables I get the correct results but without the number of observations and other relevant statistics like the null deviance and the model deviance.
Here's an example:
library(lmtest)
library(stargazer)

# Simple binomial regression
m1 <- glm(am ~ mpg + cyl + disp, mtcars, family = binomial)

# For whatever reason, let's say I want to use coeftest to estimate something
m <- coeftest(m1)

# This is fine, but I want to also include the number of observations,
# the null deviance and the model deviance.
stargazer(m, type = "text", single.row = T)
I'm specifically interested in the number of observations, the null deviance and the residual deviance.
I thought that if I replaced the old coefficient matrix with the new one, I'd get the correct estimates with the correct statistics, and stargazer would recognize the model and print it correctly. To that end, I tried substituting the coefficients, SEs, z statistics and p-values from the coeftest output into the m1 model, but some of these statistics are computed by summary.glm and are not included in the m1 output. I could easily substitute these coefficients in the summary output, but stargazer doesn't recognize summary-type classes. I've also tried adding attributes to the m object with the specific statistics, but they don't show up in the output and stargazer doesn't recognize them.
Note: I know stargazer can compute robust SE's but I'm also doing other computations, so the example needs to include the coeftest output.
Any help is appreciated.
It may be easiest to pass the original models into stargazer, and then use coeftest to pass in custom values for standard errors (se = ), confidence intervals (ci.custom = ) and/or p values (p = ). See below for how to easily handle a list containing multiple models.
suppressPackageStartupMessages(library(lmtest))
suppressPackageStartupMessages(library(stargazer))

mdls <- list(
  m1 = glm(am ~ mpg, mtcars, family = poisson),
  m2 = glm(am ~ mpg + cyl + disp, mtcars, family = poisson)
)

# Calculate robust standard errors
se_robust <- function(x)
  coeftest(x, vcov. = sandwich::sandwich)[, "Std. Error"]

# Original SE
stargazer(mdls, type = "text", single.row = T, report = "vcsp")
#> 
#> ===============================================
#>                       Dependent variable:      
#>                   -----------------------------
#>                                am
#>                        (1)             (2)
#> -----------------------------------------------
#> mpg               0.106 (0.042)   0.028 (0.083)
#>                     p = 0.012       p = 0.742
#> cyl                               0.435 (0.496)
#>                                     p = 0.381
#> disp                             -0.014 (0.009)
#>                                     p = 0.151
#> Constant         -3.247 (1.064)  -1.488 (3.411)
#>                     p = 0.003       p = 0.663
#> -----------------------------------------------
#> Observations           32              32
#> Log Likelihood      -21.647         -20.299
#> Akaike Inf. Crit.    47.293          48.598
#> ===============================================
#> Note:               *p<0.1; **p<0.05; ***p<0.01
# With robust SE
stargazer(
  mdls, type = "text", single.row = TRUE, report = "vcsp",
  se = lapply(mdls, se_robust))
#> 
#> ===============================================
#>                       Dependent variable:      
#>                   -----------------------------
#>                                am
#>                        (1)             (2)
#> -----------------------------------------------
#> mpg               0.106 (0.025)   0.028 (0.047)
#>                    p = 0.00002      p = 0.560
#> cyl                               0.435 (0.292)
#>                                     p = 0.137
#> disp                             -0.014 (0.007)
#>                                     p = 0.042
#> Constant         -3.247 (0.737)  -1.488 (2.162)
#>                    p = 0.00002      p = 0.492
#> -----------------------------------------------
#> Observations           32              32
#> Log Likelihood      -21.647         -20.299
#> Akaike Inf. Crit.    47.293          48.598
#> ===============================================
#> Note:               *p<0.1; **p<0.05; ***p<0.01
Created on 2020-11-09 by the reprex package (v0.3.0)
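The same pattern extends to custom p-values; a sketch along the same lines, where p_robust is a hypothetical companion to se_robust above:

# Hypothetical companion to se_robust(): robust p-values from coeftest()
p_robust <- function(x)
  coeftest(x, vcov. = sandwich::sandwich)[, "Pr(>|z|)"]

stargazer(mdls, type = "text", single.row = TRUE, report = "vcsp",
          se = lapply(mdls, se_robust),
          p = lapply(mdls, p_robust))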
If I understand you correctly, you could try the following. First, assign your stargazer analysis to an object like this:

stargazer.values <- stargazer(m, type = "text", single.row = T)

Then check the code of the stargazer command with body(stargazer). Hopefully you can find objects for values that stargazer uses but does not report. You can then address them like this (if there is, for example, an object named "null.deviance"):

stargazer.values$null.deviance

Or, if it is part of another data frame, say df, it could go like this:

stargazer.values$df$null.deviance

Maybe code like this could be helpful:

print(null.deviance <- stargazer.values$null.deviance)

Hope this helps!
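Alternatively, a more direct sketch using the m and m1 objects from the question: the null and residual deviance are stored on the fitted glm itself, so they can be appended under the coeftest table with add.lines:

# Null and residual deviance live on the glm object (m1), not on the
# coeftest output (m), so pull them from there and append as extra lines
stargazer(m, type = "text", single.row = TRUE,
          add.lines = list(
            c("Observations", nobs(m1)),
            c("Null deviance", sprintf("%.2f", m1$null.deviance)),
            c("Residual deviance", sprintf("%.2f", m1$deviance))))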