Adding customized options to gtsummary tables - r

I'm trying to figure out how to add customized options when using gtsummary--for example, stars for pvalues, captions, etc.
Here's a reproducible example using base mtcars data, in case that's more efficient...
library(tidyverse)
library(gtsummary)
#> Warning: package 'gtsummary' was built under R version 4.0.3
#> #Uighur
r1 <- lm(mpg ~ wt + cyl, data = mtcars) %>%
tbl_regression(exponentiate = TRUE)
r2 <- lm(hp ~ wt + cyl, data = mtcars) %>%
tbl_regression(exponentiate = TRUE)
r3 <- lm(qsec ~ wt + cyl, data = mtcars) %>%
tbl_regression(exponentiate = TRUE)
tbl_merge(list(r1, r2, r3),
tab_spanner = c("**MPG**", "**Horsepower**", "**Seconds**"))

You can use the add_significance_stars() function to add stars to your estimates. To add titles and other formatting, convert the gtsummary object to gt with the as_gt() function and add them using gt functions.
Example below.
library(gtsummary)
library(tidyverse)
packageVersion("gtsummary")
#> [1] '1.4.0'
# create a tibble with one row per model
tbl <-
tibble(outcome = c("mpg", "hp", "qsec")) %>%
rowwise() %>%
mutate(
tbl =
lm(str_glue("{outcome} ~ wt + cyl"), data = mtcars) %>%
tbl_regression() %>%
add_significance_stars(
hide_se = TRUE,
hide_ci = FALSE
) %>%
list()
) %>%
# pull tbl_regression() objects into single merged table
pull(tbl) %>%
tbl_merge(tab_spanner = c("**MPG**", "**Horsepower**", "**Seconds**")) %>%
# add table captions
as_gt() %>%
gt::tab_header(title = "Table 1. Car Regression Model",
subtitle = "Highly Confidential")
Created on 2021-04-15 by the reprex package (v2.0.0)

Related

How to present interaction variables horizontally in tbl_regression in R

I want to present the coefficient of interaction horizontally rather than vertically with tbl_regression:
library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.6.0'
tbl1 <- trial %>%
mutate(early=case_when(stage%in%c("T1","T2")~1,T~0)) %>%
glm(response ~ age * early , family = binomial, data=.) %>%
tbl_regression(
exponentiate = TRUE )
tbl1
tbl2 <- trial %>%
mutate(late=case_when(stage%in%c("T3","T4")~1,T~0)) %>%
glm(response ~ age * late, family = binomial, data=.) %>%
tbl_regression(
exponentiate = TRUE )
tbl2
tbl_stack (list(tbl1,tbl2))
Created on 2022-07-20 by the reprex package (v2.0.1)
I want to have the variables presented horizontally (similar to tbl_merg, but the values are from within the same module), in order to stack different values beneath it.
This requires some customization, but is certainly possible.
library(gtsummary)
#> #BlackLivesMatter
packageVersion("gtsummary")
#> [1] '1.6.1'
tbl1 <-
trial %>%
mutate(early = dplyr::case_when(stage %in% c("T1","T2") ~ 1, TRUE ~ 0)) %>%
glm(response ~ age * early , family = binomial, data=.) %>%
tbl_regression(exponentiate = TRUE)
tbl_final <-
1:3 %>%
purrr::map(
function(i) {
tbl1 %>%
modify_table_body(
~ .x %>%
dplyr::mutate(
label = label[1],
variable = variable[1],
row_type = row_type[1]
) %>%
dplyr::filter(dplyr::row_number() %in% i)
)
}
) %>%
tbl_merge(tab_spanner = c("**Variable**", "**Early**", "**Interaction**"))
Created on 2022-07-20 by the reprex package (v2.0.1)

Creating a tbl_continuous with weight data using gtsummary

Problem:
Can't find a way to create a tbl_continuous of a weighted numerical variable. I'm using tbl_svysummary to create my categorical variable tables but it's not useful when i try to do the same with numerical variable tables. Note: i'm not looking for a general mean of my numeric variable but separated by or groupped by a categorical variable.
Attempt:
For example, i've created this table with the help of the function tbl_continuous which does exactly what i want: the mean of my numeric variable but by the levels of my categorical variable. The only problem is that i can't insert a weight variable into this function.
```{r}
base2 %>%
as_label() %>%
select(ing_cap, ano, nacional, dominio) %>%
tbl_continuous(variable = ing_cap,
by = ano,
statistic = list(everything() ~ "{median}"))
```
Also, i have been creating weighted data with the srvyr package in the following way:
base2 %>%
labelled::drop_unused_value_labels() %>%
as_label() %>%
as_survey_design(weight = fac500a)
Could add up to the solution.
Request:
Create this same table (shown in the image) but with a weight variable. My weight variable in my data is called fac500a.
My data:
My data can be dowloaded from my github repo and has the following dimensions:
> dim(base2)
[1] 108103 44
https://github.com/aito123/quarto_blog/raw/master/posts/tablas_tesis/base2.sav
(dput output is long)
My current packages:
I'm using this r packages so far: tidyverse, srvyr, gtsummary, sjlabelled, haven
Conclusion:
Let me know if it's neccesary to provide more information.
The gtsummary package does not export an analogous function of tbl_continuous() for survey data. But you can construct the table. Example below!
library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.6.0'
svy_trial <- survey::svydesign(ids = ~1, data = trial, weights = ~1)
tbl <-
svy_trial %>%
tbl_strata2(
strata = grade,
~ .x %>%
tbl_svysummary(
by = trt,
include = age,
missing = "no",
label = list(age = .y)
) %>%
modify_header(all_stat_cols() ~ "**{level}**"),
.combine_with = "tbl_stack",
.combine_args = list(group_header = NULL)
) %>%
modify_table_body(
~ .x %>%
mutate(variable = "grade", row_type = "level") %>%
tibble::add_row(
row_type = "label",
variable = "grade",
label = "Grade",
.before = 1L
)
) %>%
modify_column_indent(columns = label, rows = row_type == "level") %>%
bold_labels() %>%
modify_spanning_header(all_stat_cols() ~ "**Treatment**") %>%
modify_footnote(all_stat_cols() ~ "Age: Median (IQR)")
Created on 2022-05-16 by the reprex package (v2.0.1)

How to put the number of observations at the bottom of gtsummary regression table?

I would like to put the number of observations included in set of regression models at the bottom of a gtsummary table, in the same columns as the coefficient estimates. It is straightforward to put the numbers of observations in columns:
library(dplyr)
library(gtsummary)
df <- mtcars %>%
mutate(cyl_miss = if_else(
cyl == 6,
NA_real_,
cyl
))
model_1 <- lm(
data = df,
formula = mpg ~ cyl + disp
)
model_2 <- lm(
data = df,
formula = mpg ~ cyl_miss + disp
)
table_1 <- tbl_regression(model_1) %>%
add_significance_stars(
pattern = "{estimate}{stars}",
thresholds = c(0.001, 0.01, 0.05),
hide_ci = TRUE,
hide_p = TRUE,
hide_se = FALSE
) %>%
add_n()
table_2 <- tbl_regression(model_2) %>%
add_significance_stars(
pattern = "{estimate}{stars}",
thresholds = c(0.001, 0.01, 0.05),
hide_ci = TRUE,
hide_p = TRUE,
hide_se = FALSE
) %>%
add_n()
tbl_merge(
list(table_1, table_2)
)
How can I put the numbers (here 32 and 25) in the Beta columns, in a row labelled "N"?
To add the N to a new row of the table, you'll want to use the add_glance_table() function. Example below!
library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.4.2'
df <-
mtcars %>%
dplyr::mutate(
cyl_miss = ifelse(cyl == 6, NA_real_, cyl)
)
model_1 <- lm(data = df, formula = mpg ~ cyl + disp)
model_2 <- lm(data = df, formula = mpg ~ cyl_miss + disp)
table_1 <-
tbl_regression(model_1) %>%
add_significance_stars() %>%
add_glance_table(include = nobs)
table_2 <-
tbl_regression(model_2) %>%
add_significance_stars() %>%
add_glance_table(include = nobs)
table_final <-
tbl_merge(list(table_1, table_2)) %>%
# ensure the glance statistics are at the bottom of table
modify_table_body(~.x %>% dplyr::arrange(row_type == "glance_statistic"))
Created on 2021-09-14 by the reprex package (v2.0.1)

Maintaining glance footnotes after merge

I’d like to get the same results as those from gtsummary::add_glance_source_note() when creating a gtsummary::tbl_merge().
The function itself takes a tbl_regression for an argument, so there’s no using it in the merge pipeline, and if I add the notes to individual tables, they are lost when tables are merged.
library(tidyverse)
library(gtsummary)
library(nycflights13)
lm_1 <- lm(arr_delay ~ air_time, flights)
tbl_1 <- tbl_regression(lm_1, exponentiate = F) %>%
add_glance_source_note(include = c('r.squared'))
lm_2 <- lm(distance ~ air_time, flights)
tbl_2 <- tbl_regression(lm_2, exponentiate = F) %>%
add_glance_source_note(include = c('r.squared'))
tbl_1
tbl_2
Both tables have footnote indicating their model’s R squared. However, when I merge the tables, the fit information in footnotes is lost:
table.pub <- tbl_merge(
list(tbl_1, tbl_2),
tab_spanner = c("Delay", "Distance")
)
Is there any way to keep the “glance” information, or re-attach it in the final merged table?
Thanks!
UPDATE: As of gtsummary v.1.4.0 this is more easily accomplished using add_glance_table().
library(gtsummary)
library(nycflights13)
packageVersion("gtsummary")
#> [1] '1.4.0'
tbl_1 <-
lm(arr_delay ~ air_time, flights) %>%
tbl_regression(exponentiate = F) %>%
add_glance_table(include = c('r.squared'))
tbl_2 <-
lm(distance ~ air_time, flights) %>%
tbl_regression(exponentiate = F) %>%
add_glance_table(include = c('r.squared'))
tbl <-
tbl_merge(
list(tbl_1, tbl_2),
tab_spanner = c("**Delay**", "**Distance**")
)
Created on 2021-04-15 by the reprex package (v2.0.0)
PREVIOUS RESPONSE:
The glance statistics are added as a source note. The tricky thing is that source notes apply to the entire table. It's entirely clear what the statistics refer to when you have a single tbl_regression() table. But once one or more are merged, it's not clear how the source notes should be presented. For that reason they are not presented after the merge.
But, the note is saved within the gtsummary table, and you can print them. In the example below, I label each of the R2 values by the outcome of the model and add them to the merged table.
Happy Programming!
library(tidyverse)
library(gtsummary)
library(nycflights13)
lm_1 <- lm(arr_delay ~ air_time, flights)
tbl_1 <- tbl_regression(lm_1, exponentiate = F) %>%
add_glance_source_note(include = c('r.squared'))
lm_2 <- lm(distance ~ air_time, flights)
tbl_2 <- tbl_regression(lm_2, exponentiate = F) %>%
add_glance_source_note(include = c('r.squared'))
tbl_1
tbl_2
tbl_merge(
list(tbl_1, tbl_2),
tab_spanner = c("**Delay**", "**Distance**")
) %>%
as_gt() %>%
gt::tab_source_note(
str_glue("Delay {tbl_1$list_output$source_note}; ",
"Distance {tbl_1$list_output$source_note}")
)

Using purrr::pmap() in a rowwise manner outside of mutate()

I am trying to use purrr::pmap() to apply a custom function in a rowwise fashion along some dataframe rows. I can achieve my desired end result with a for-loop and with apply(), but when I try to use pmap() I can only get the result I want in combination with mutate(), which in my real-life applied case will be insufficient.
Is there a way to use pmap() to apply my custom function and just have the output print rather than be stored in a new column?
library(dplyr)
library(purrr)
library(tibble)
Create demo data & custom function
set.seed(57)
ds_mt <-
mtcars %>%
rownames_to_column("model") %>%
mutate(
am = factor(am, labels = c("auto", "manual")),
vs = factor(vs, labels = c("V", "S"))
) %>%
select(model, mpg, wt, cyl, am, vs) %>%
sample_n(3)
foo <- function(model, am, mpg){
print(
paste("The", model, "has a", am, "transmission and gets", mpg, "mpgs.")
)
}
Successful example of rowwise for-loop:
for (row in 1:nrow(ds_mt)) {
foo(
model = ds_mt[row, "model"],
am = ds_mt[row, "am"],
mpg = ds_mt[row, "mpg"]
)
}
Successful example using apply():
row.names(ds_mt) <- NULL # to avoid named vector as output
apply(
ds_mt,
MARGIN = 1,
FUN = function(ds)
foo(
model = ds["model"],
am = ds["am"],
mpg = ds["mpg"]
)
)
Example using pmap() within mutate() that is almost what I need.
ds_mt %>%
mutate(new_var =
pmap(
.l =
list(
model = model,
am = am,
mpg = mpg
),
.f = foo
))
FAILING CODE: Why doesn't this work?
ds_mt %>%
pmap(
.l =
list(
model = model,
am = am,
mpg = mpg
),
.f = foo
)
So after some more reading it seems this is a case for pwalk() rather than pmap(), because I am trying to get output to print (i.e., a side effect) rather than to be stored in a dataframe.
library(dplyr)
library(purrr)
library(tibble)
set.seed(57)
ds_mt <-
mtcars %>%
rownames_to_column("model") %>%
mutate(
am = factor(am, labels = c("auto", "manual")),
vs = factor(vs, labels = c("V", "S"))
) %>%
select(model, mpg, wt, cyl, am, vs) %>%
sample_n(3)
foo <- function(model, am, mpg){
print(
paste("The", model, "has a", am, "transmission and gets", mpg, "mpgs.")
)
}
ds_mt %>%
select(model, am, mpg) %>%
pwalk(
.l = .,
.f = foo
)

Resources