How to rename a complicated variable name in fixest etable

I'm wondering how to rename a complex variable using the dict argument of etable in the fixest package.
For example, I have a regression Y ~ x1 + x2:abs(x3):x4 and I'd like to change the name of x2:abs(x3):x4.
I have tried
etable(...,
dict = c(`x2:abs(x3):x4` = 'myvar')
)
etable(...,
dict = c("x2:abs(x3):x4" = 'myvar')
)
etable(...,
dict = c("x2*abs(x3)*x4" = 'myvar')
)
But no success. Is there an easy fix for this?

It works. It's likely a version problem:
library(fixest)
est = feols(mpg ~ cyl:abs(disp):hp, mtcars)
etable(est, dict=c("cyl:abs(disp):hp" = "New coef"))
#> est
#> Dependent Var.: mpg
#>
#> (Intercept) 25.05*** (0.9560)
#> New coef -1.65e-5*** (2.3e-6)
#> _______________ ____________________
#> S.E. type Standard
#> Observations 32
#> R2 0.63073
#> Adj. R2 0.61842
Otherwise, please provide a minimal reproducible example.
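If the rename still does nothing, a mistyped key is one thing worth ruling out, since a dict entry only applies when its name matches a name the model actually stores. A minimal sketch, assuming the same mtcars model as above (coef_names, target and new_labels are illustrative names, not part of the original answer):

```r
library(fixest)

est <- feols(mpg ~ cyl:abs(disp):hp, mtcars)

# Inspect the coefficient names exactly as the model stores them
coef_names <- names(coef(est))
print(coef_names)

# Build the dict from the stored name instead of typing it by hand
# (assumption: the only non-intercept term is the one to rename)
target <- setdiff(coef_names, "(Intercept)")
new_labels <- setNames("New coef", target)

etable(est, dict = new_labels)
```

Copying the key from names(coef(est)) sidesteps any differences in spacing or term order between what was typed in the formula and what ends up stored.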

Related

How to write a for loop which creates a model and has a function which references that same model

I am trying to run a post hoc analysis on an unbalanced two-way ANOVA using the anova_test function in the rstatix package. I need to run this post hoc test iteratively, as I have ~26 response (y) variables. My first step is to create models of all my y variables with relation to group and treatment. I have successfully managed to do this, creating a single list with 26 models:
models <- map(data[,y1:y26], ~(lm(.x ~data$group*data$treatment)))
Now comes the part I'm stuck on. Referring to these models iteratively. I would like to run the following code for every y variable I have:
group_by(group) %>%
anova_test(y ~ treatment, error = models(y), type = 3)
where my y changes every time and, as it does, the "model" (referred to in the error = term) is updated accordingly. I'm struggling with this bit since the first set of models I make is used to inform the second set of models.
However, if I run just one y variable through this whole bit of code at one time, I get the appropriate results.
model <- lm(y ~ group*treatment, data = data)
data %>%
group_by(group) %>%
anova_test(y ~ treatment, error = model, type = 3)
I have tried creating a for loop as well as using the map function in the purrr package but I have been unsuccessful. I am new to for loops and purrr so I am sure it's a simple fix I just can't see it.
Basically I want a way to run
data %>%
group_by(group) %>%
anova_test(y ~ treatment, error = model, type = 3)
iteratively for different y variables (y1, y2, ..., y26) while also referring to the appropriate model (model$y1, model$y2, ..., model$y26).
Thanks for your help!
Well, you didn't give any data, so let's use ToothGrowth. You seem to like the model format, so let's build a list of models. You could do this in an automated fashion, but to make it clear let's do it by hand. Then call purrr::map with the anova_test function. You'll get a list back. Since you're in charge of naming the list elements, go to town.
Updated answer May 18th: now using map2, since you want two different models passed. Build a list for each...
library(rstatix)
library(purrr)
ToothGrowth$len2 <- ToothGrowth$len^2 # for variety
models <- list(model1 = lm(len ~ supp*dose, ToothGrowth),
model2 = lm(len ~ dose*supp, ToothGrowth),
model3 = lm(len2 ~ dose*supp, ToothGrowth),
model4 = lm(len2 ~ supp*dose, ToothGrowth))
models2 <- list(model1 = lm(len ~ supp, ToothGrowth),
model2 = lm(len ~ dose, ToothGrowth),
model3 = lm(len2 ~ dose, ToothGrowth),
model4 = lm(len2 ~ supp, ToothGrowth))
# one model
purrr::map(models, ~ anova_test(.x, type = 3))
# now with model for error term
purrr::map2(models, models2, ~ anova_test(.x, error = .y, type = 3))
#> Coefficient covariances computed by hccm()
#> Coefficient covariances computed by hccm()
#> Coefficient covariances computed by hccm()
#> Coefficient covariances computed by hccm()
#> $model1
#> ANOVA Table (type III tests)
#>
#> Effect DFn DFd F p p<.05 ges
#> 1 supp 1 58 4.058 0.049000 * 0.065
#> 2 dose 1 58 12.717 0.000734 * 0.180
#> 3 supp:dose 1 58 1.588 0.213000 0.027
#>
#> $model2
#> ANOVA Table (type III tests)
#>
#> Effect DFn DFd F p p<.05 ges
#> 1 dose 1 58 33.626 2.92e-07 * 0.367
#> 2 supp 1 58 10.729 2.00e-03 * 0.156
#> 3 dose:supp 1 58 4.200 4.50e-02 * 0.068
#>
#> $model3
#> ANOVA Table (type III tests)
#>
#> Effect DFn DFd F p p<.05 ges
#> 1 dose 1 58 36.028 1.35e-07 * 0.383
#> 2 supp 1 58 7.128 1.00e-02 * 0.109
#> 3 dose:supp 1 58 2.709 1.05e-01 0.045
#>
#> $model4
#> ANOVA Table (type III tests)
#>
#> Effect DFn DFd F p p<.05 ges
#> 1 supp 1 58 2.684 0.107000 0.044
#> 2 dose 1 58 13.566 0.000508 * 0.190
#> 3 supp:dose 1 58 1.020 0.317000 0.017
Thanks to Nirgrahamuk from the rstudio community forum for this answer:
map(names(models_1) ,
~ anova_test(data=group_by(df,edge),
formula = as.formula(paste0(.x,"~ trt")),
error = models_1[[.x]],
type = 3))
(see their full answer at: https://community.rstudio.com/t/trouble-using-group-by-and-map2-together/66730/8?u=mvula)
Created on 2020-05-20 by the reprex package (v0.3.0)
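The first answer mentions that the list of models could be built in an automated fashion instead of by hand. A rough sketch of one way to do that in base R, assuming ToothGrowth stands in for the asker's data and len / len2 stand in for the response columns y1 ... y26 (tooth and responses are illustrative names):

```r
# Assumption: ToothGrowth replaces the asker's data frame
tooth <- ToothGrowth
tooth$len2 <- tooth$len^2  # a second response, for variety

responses <- c("len", "len2")

# One lm per response, stored in a list named after the response
models <- lapply(setNames(responses, responses), function(y) {
  lm(as.formula(paste(y, "~ supp * dose")), data = tooth)
})

names(models)
formula(models$len2)
```

Because the list is named by response, models[[y]] can be looked up inside the same loop that builds the anova_test formula, which is the pattern used in the map(names(models_1), ...) answer above.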

How to get between and overall R2 from plm FE regression with stargazer?

Disclaimer: This question is closely related to this one I asked two days ago, but now it relates to the implementation of between and overall R2 in stargazer() output, not in summary() as before.
Is there a way to get plm() to calculate between R2 and overall R2 for me and include them in the stargazer() output?
To clarify what I mean by between, overall, and within R2, see this answer on StackExchange.
My understanding is that plm only calculates within R2.
I am running a Twoways effects Within Model.
library(plm)
library(stargazer)
# Create some random data
set.seed(1)
x=rnorm(100); fe=rep(rnorm(10),each=10); id=rep(1:10,each=10); ti=rep(1:10,10); e=rnorm(100)
y=x+fe+e
data=data.frame(y,x,id,ti)
# Get plm within R2
reg=plm(y~x,model="within",index=c("id","ti"), effect = "twoways", data=data)
stargazer(reg)
I now also want to include between and overall R2 in the stargazer() output. How can I do that?
To make it explicit what I mean with between and overall R2:
# Pooled Version (overall R2)
reg1=lm(y~x)
summary(reg1)$r.squared
# Between R2
y.means=tapply(y,id,mean)[id]
x.means=tapply(x,id,mean)[id]
reg2=lm(y.means~x.means)
summary(reg2)$r.squared
To do this in stargazer, you can use the add.lines argument. However, this adds the lines to the beginning of the summary stats section and there is no way to alter this without messing with the source code, which is beastly. I much prefer huxtable, which provides a grammar of table building and is much more extensible and customizable.
library(tidyverse)
library(plm)
library(stargazer)
library(huxtable)
# Create some random data
set.seed(1)
x=rnorm(100); fe=rep(rnorm(10),each=10); id=rep(1:10,each=10); ti=rep(1:10,10); e=rnorm(100)
y=x+fe+e
data=data.frame(y,x,id,ti)
# Get plm within R2
reg=plm(y~x,model="within",index=c("id","ti"), effect = "twoways", data=data)
stargazer(reg, type = "text",
add.lines = list(c("Overall R2", round(r.squared(reg, model = "pooled"), 3)),
c("Between R2", round(r.squared(update(reg, effect = "individual", model = "between")), 3))))
#>
#> ========================================
#> Dependent variable:
#> ---------------------------
#> y
#> ----------------------------------------
#> x 1.128***
#> (0.113)
#>
#> ----------------------------------------
#> Overall R2 0.337
#> Between R2 0.174
#> Observations 100
#> R2 0.554
#> Adjusted R2 0.448
#> F Statistic 99.483*** (df = 1; 80)
#> ========================================
#> Note: *p<0.1; **p<0.05; ***p<0.01
# I prefer huxreg, which is much more customizable!
# Create a data frame of the R2 values
r2s <- tibble(
name = c("Overall R2", "Between R2"),
value = c(r.squared(reg, model = "pooled"),
r.squared(update(reg, effect = "individual", model = "between"))))
tab <- huxreg(reg) %>%
# Add new R2 values
add_rows(hux(r2s), after = 4)
# Rename R2
tab[7, 1] <- "Within R2"
tab %>% huxtable::print_screen()
#> ─────────────────────────────────────────────────
#> (1)
#> ─────────────────────────
#> x 1.128 ***
#> (0.113)   
#> ─────────────────────────
#> N 100        
#> Overall R2 0.337    
#> Between R2 0.174    
#> Within R2 0.554    
#> ─────────────────────────────────────────────────
#> *** p < 0.001; ** p < 0.01; * p < 0.05.
#>
#> Column names: names, model1
Created on 2020-04-08 by the reprex package (v0.3.0)
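For completeness, the within R2 can also be reproduced by hand, in the same spirit as the pooled and between versions in the question. This is only a sketch: it assumes a balanced panel (as in the simulated data), where the two-way within transformation reduces to subtracting unit and time means and adding back the grand mean. The helper demean2 is a made-up name. On this data the result should agree with the R2 that plm reports, but treat that as something to check rather than a guarantee.

```r
# Same simulated balanced panel as in the question
set.seed(1)
x <- rnorm(100); fe <- rep(rnorm(10), each = 10)
id <- rep(1:10, each = 10); ti <- rep(1:10, 10); e <- rnorm(100)
y <- x + fe + e

# Two-way within transformation for a balanced panel:
# subtract unit means and time means, add back the grand mean
demean2 <- function(v) v - ave(v, id) - ave(v, ti) + mean(v)

y_w <- demean2(y)
x_w <- demean2(x)

# R2 of the regression on the transformed variables
within_r2 <- summary(lm(y_w ~ x_w))$r.squared
within_r2
```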

Stargazer and t-Statistics

Does anyone know how to change the format of t-stats in stargazer? I tried a bunch of things but haven't had any luck.
I would like the t-statistics shown below the coefficient and in brackets, i.e. drop the "t =" and show the t-statistic inside parentheses: (xxxx)
For example:
(1)
Variable 1 0.102
t = 3.494
I would like
(1)
Variable 1 0.102
(3.494)
If you are interested in a non-stargazer option, you might want to try the modelsummary package (disclaimer: I am the author). modelsummary accepts strings enclosed in curly braces in the glue package format, so you can do a bunch of weird things. You can read the details at this link, but here's an example:
library(modelsummary)
models <- list(
lm(hp ~ mpg, data = mtcars),
lm(hp ~ mpg + drat, data = mtcars))
modelsummary(models,
statistic = c(
"statistic",
"conf.int",
"Std. Error: {std.error}{stars}"))
There is no real option to do this with stargazer as the format of t-statistics is hard-coded.
Instead, replace the standard errors with the t-statistics and override p-values so the right stars appear.
I did this for multiple models below, as this is the more general solution (it works for one model too).
library(stargazer)
#>
#> Please cite as:
#> Hlavac, Marek (2018). stargazer: Well-Formatted Regression and Summary Statistics Tables.
#> R package version 5.2.2. https://CRAN.R-project.org/package=stargazer
models <- list()
models[[1]] <- lm(mpg ~ cyl + disp, data = mtcars)
models[[2]] <- lm(mpg ~ cyl + disp + wt, data = mtcars)
get_ts <- function(fm) {
summary(fm)$coefficients[,3]
}
get_pvals <- function(fm) {
summary(fm)$coefficients[,4]
}
ts <- lapply(models, get_ts)
pvals <- lapply(models, get_pvals)
stargazer(models, type = "text", report=('vc*s'), se = ts, p = pvals)
#>
#> =================================================================
#> Dependent variable:
#> ---------------------------------------------
#> mpg
#> (1) (2)
#> -----------------------------------------------------------------
#> cyl -1.587** -1.785***
#> (-2.230) (-2.940)
#>
#> disp -0.021* 0.007
#> (-2.007) (0.631)
#>
#> wt -3.636***
#> (-3.495)
#>
#> Constant 34.661*** 41.108***
#> (13.609) (14.462)
#>
#> -----------------------------------------------------------------
#> Observations 32 32
#> R2 0.760 0.833
#> Adjusted R2 0.743 0.815
#> Residual Std. Error 3.055 (df = 29) 2.595 (df = 28)
#> F Statistic 45.808*** (df = 2; 29) 46.424*** (df = 3; 28)
#> =================================================================
#> Note: *p<0.1; **p<0.05; ***p<0.01
Created on 2021-05-11 by the reprex package (v2.0.0)
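For the exact layout asked for (t-statistic in parentheses under the coefficient), the glue-style strings mentioned in the modelsummary answer above can be used directly. A minimal sketch, assuming a reasonably recent modelsummary; output = "data.frame" is used here only so the result is easy to inspect, and tab is an illustrative name:

```r
library(modelsummary)

models <- list(
  lm(mpg ~ cyl + disp, data = mtcars),
  lm(mpg ~ cyl + disp + wt, data = mtcars))

# "({statistic})" is a glue-style template:
# the test statistic wrapped in parentheses
tab <- modelsummary(models,
                    statistic = "({statistic})",
                    output = "data.frame")
tab
```

Dropping the output argument should give the usual formatted table with the same contents.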
Can you give a minimal reproducible example? I think there is no "t = " by default.
library(stargazer)
model <- lm(mpg ~ cyl + disp, data=mtcars)
stargazer(model, type="text")
===============================================
Dependent variable:
---------------------------
mpg
-----------------------------------------------
cyl -1.587**
(0.712)
disp -0.021*
(0.010)
Constant 34.661***
(2.547)
-----------------------------------------------
Observations 32
R2 0.760
Adjusted R2 0.743
Residual Std. Error 3.055 (df = 29)
F Statistic 45.808*** (df = 2; 29)
===============================================
Note: *p<0.1; **p<0.05; ***p<0.01

Stargazer pulls apart variables when observations dropped

I use stargazer to create a table for multiple models. They are actually the same model but the first is based on all observations, while the other drop different observations respectively. All variables are named the same, so what surprises me is that when I export the table to Latex, two lines, one for a dummy variable and another for an interaction term, are duplicated.
What is really strange is that I cannot replicate the results, but I will post a minimal working example nonetheless. Perhaps you can help me based on my description alone.
This is the code for my MWE:
library(tibble)
library(stargazer)
df <- as_tibble(data.frame(first = rnorm(100, 50), second = rnorm(100, 30), third = rnorm(100, 100), fourth = c(rep(0, 50), rep(1, 50))))
model.1 <- lm(first ~ second + third + fourth + third*fourth, data = df)
model.2 <- lm(first ~ second + third + fourth + third*fourth, data = df[!rownames(df) %in% "99",])
stargazer(model.1, model.2)
I will now post the LaTeX output that includes the error I am trying to fix (with this snippet it seems to work just fine).
What I would like to have, of course, is the code as produced by this snippet (I feel very stupid for not being able to reproduce it):
You could take a look at the names of your models' coefficients using coefficients(). Make sure they are identical, i.e. identical(names(coefficients(model.1)), names(coefficients(model.2))). Then use stargazer's keep argument to make sure you get the coefficients you want.
Here with the example above, keeping selected variables:
coefficients(model.1)
#> (Intercept) second third fourth third:fourth
#> 57.27352606 0.02674072 -0.08236250 20.23596216 -0.20288137
coefficients(model.2)
#> (Intercept) second third fourth third:fourth
#> 57.06149556 0.03305134 -0.08214812 20.85087288 -0.20885718
identical(names(coefficients(model.1)), names(coefficients(model.2)))
#> [1] TRUE
I'm using the type = "text" to make it more friendly to SO, but I guess it's the same with LaTeX,
stargazer(model.1, model.2, type = "text", keep=c("third","third:fourth"))
#>
#> =========================================================
#> Dependent variable:
#> -------------------------------------
#> first
#> (1) (2)
#> ---------------------------------------------------------
#> third -0.082 -0.082
#> (0.166) (0.167)
#>
#> third:fourth -0.203 -0.209
#> (0.222) (0.223)
#>
#> ---------------------------------------------------------
#> Observations 100 99
#> R2 0.043 0.044
#> Adjusted R2 0.002 0.004
#> Residual Std. Error 1.044 (df = 95) 1.047 (df = 94)
#> F Statistic 1.056 (df = 4; 95) 1.089 (df = 4; 94)
#> =========================================================
#> Note: *p<0.1; **p<0.05; ***p<0.01
but it might be hard to rule out that it's a local issue if we cannot find a way to reproduce your issue.
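One situation in which the coefficient names of two otherwise equivalent models differ is when the interaction is written in a different order, which could plausibly lead a table to treat the terms as separate rows. This is only a guess at the cause, not a diagnosis of the asker's case. A small base R sketch with made-up data (m_a and m_b are illustrative names):

```r
set.seed(42)
df <- data.frame(first  = rnorm(100, 50),
                 second = rnorm(100, 30),
                 third  = rnorm(100, 100),
                 fourth = rep(0:1, each = 50))

# Same model, but the interaction is written in a different order
m_a <- lm(first ~ second + third * fourth, data = df)
m_b <- lm(first ~ second + fourth * third, data = df)

names(coef(m_a))
names(coef(m_b))

# Names present in one model but not in the other
only_in_a <- setdiff(names(coef(m_a)), names(coef(m_b)))
only_in_b <- setdiff(names(coef(m_b)), names(coef(m_a)))
only_in_a
only_in_b
```

If either setdiff() result is non-empty for the real models, harmonising the formulas is probably the first thing to try.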

How can I omit the regression intercept from my results table in stargazer

I run a regression of the type
model <- lm(y~x1+x2+x3, weights = wei, data=data1)
and then create my table
t <- stargazer(model, omit="x2", omit.labels="x1")
but I haven't found a way to omit the intercept results from the table. I need it in the regression, yet I don't want to show it in the table.
Is there a way to do it through stargazer?
I don't have your dataset, but typing omit = c("Constant", "x2") should work.
As a reproducible example (stargazer 5.2)
stargazer::stargazer(
lm(Fertility ~ . ,
data = swiss),
type = "text",
omit = c("Constant", "Agriculture"))
Edit: Add in omit.labels
mdls <- list(
m1 = lm(Days ~ -1 + Reaction, data = lme4::sleepstudy),
m2 = lm(Days ~ Reaction, data = lme4::sleepstudy),
m3 = lm(Days ~ Reaction + Subject, data = lme4::sleepstudy)
)
stargazer::stargazer(
mdls, type = "text", column.labels = c("Omit none", "Omit int.", "Omit int/subj"),
omit = c("Constant", "Subject"),
omit.labels = c("Intercept", "Subj."),
keep.stat = "n")
#>
#> ==============================================
#> Dependent variable:
#> ---------------------------------
#> Days
#> Omit none Omit int. Omit int/subj
#> (1) (2) (3)
#> ----------------------------------------------
#> Reaction 0.015*** 0.027*** 0.049***
#> (0.001) (0.003) (0.004)
#>
#> ----------------------------------------------
#> Intercept No No No
#> Subj. No No No
#> ----------------------------------------------
#> Observations 180 180 180
#> ==============================================
#> Note: *p<0.1; **p<0.05; ***p<0.01
Created on 2020-05-08 by the reprex package (v0.3.0)
Note that the table should read as follows. This appears to be a bug (stargazer 5.2.2).
#> Intercept No Yes Yes
#> Subj. No No Yes
I found a way of doing it. It is not the most clever way, but it works.
I just changed the omit command to a keep command. In my example above:
library(stargazer)
model <- lm(y~x1+x2+x3, weights = wei, data=data1)
t <- stargazer(model, keep=c("x1","x3"), omit.labels="x1")
However, it's not an efficient way when you have many variables you want to keep in the regression table.
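When there are many variables, the keep vector does not have to be typed by hand; it can be derived from the model's coefficient names. A sketch in base R, using the swiss data from the earlier answer as a stand-in for the asker's data (keep_vars is an illustrative name):

```r
model <- lm(Fertility ~ ., data = swiss)

# Everything except the intercept and the variables to hide
keep_vars <- setdiff(names(coef(model)),
                     c("(Intercept)", "Agriculture"))
keep_vars
```

The result could then be passed on as stargazer(model, type = "text", keep = keep_vars). One caveat, as far as I understand stargazer: keep entries are treated as regular expressions, so names containing special characters may match more than intended.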
