Set significant digits in summary_rows() using gt package - r

I'm trying to set the number of significant digits inside the summary_rows() function to format the mean and SD outputs, as follows:
library(gt)
library(tidyverse)
iris %>%
gt(groupname_col = "Species") %>%
summary_rows(
groups = TRUE,
columns = c(Sepal.Length:Petal.Width),
fns = list(
Mean = "mean",
SD = "sd"),
formatter = fmt_number(n_sigfig = 1))
But it returns an error saying that argument "data" is missing.
Any ideas?
Thanks!

According to the documentation, the '...' argument is for additional parameters of the formatter, so pass the function itself and supply n_sigfig separately:
library(gt)
library(tidyverse)
iris %>%
gt(groupname_col = "Species") %>%
summary_rows(
groups = TRUE,
columns = c(Sepal.Length:Petal.Width),
fns = list(
Mean = "mean",
SD = "sd"),
formatter = fmt_number, n_sigfig = 1)
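
The difference between the two calls can be seen without gt at all. The following is a minimal base-R sketch of the same pattern; apply_formatter() and fmt_sig() are hypothetical stand-ins invented for illustration and are not part of gt:

```r
# Hypothetical stand-in for a function that, like summary_rows(), accepts a
# formatter function and forwards extra arguments to it through `...`.
apply_formatter <- function(x, formatter, ...) {
  formatter(x, ...)
}

# Hypothetical formatter: round to a given number of significant digits.
fmt_sig <- function(x, n_sigfig) {
  signif(x, digits = n_sigfig)
}

# Correct: pass the function itself, and its options as separate arguments.
ok <- apply_formatter(c(5.843, 0.828), formatter = fmt_sig, n_sigfig = 1)

# Incorrect: fmt_sig(n_sigfig = 1) is evaluated without `x`, which raises a
# "missing argument" error much like the one in the question.
err <- tryCatch(
  apply_formatter(c(5.843, 0.828), formatter = fmt_sig(n_sigfig = 1)),
  error = function(e) conditionMessage(e)
)
```

Writing formatter = fmt_number(n_sigfig = 1) has the same shape as the incorrect call: the formatter is invoked with no table to work on instead of being handed over as a function.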

Related

Is there a way of indenting levels in the new "as_forest_plot" function from gtsummary/bstfun?

I'm using the excellent gtsummary package to create a forest plot of model coefficients. I see that an experimental function, as_forest_plot, has been added. It works, but I can't manage to indent the levels of categorical variables. So far I have tried the modify_column_indent function without success (code below). I'm not sure whether the row names I am calling are incorrect or whether this is simply not possible yet. My hope is to be able to separate the different variables and also to have some way of showing which level is the reference level.
Many thanks in advance
library(titanic)
library(tidyverse, warn.conflicts = FALSE)
library(gtsummary, warn.conflicts = FALSE)
library(bstfun, warn.conflicts = FALSE)
library(magrittr, warn.conflicts = FALSE)
library(janitor, warn.conflicts = FALSE)
titanic::titanic_train %>%
clean_names %>%
select(-c(name, parch, ticket, cabin, embarked)) %>%
mutate(pclass = factor(pclass)) %>%
glm(survived ~ . - passenger_id, data = ., family = "binomial") %>%
tbl_regression(exponentiate = TRUE) %>%
modify_column_indent(columns = label, rows = (header_row == TRUE)) %>%
modify_cols_merge(
pattern = "{estimate} ({ci})",
rows = !is.na(estimate)
) %>%
modify_header(estimate = "OR (95% CI)") %>%
as_forest_plot(
col_names = c("estimate", "p.value"),
col = forestplot::fpColors(box = "darkred"))
Yes, but you need to manually add the spaces. Example below!
library(gtsummary)
library(bstfun)
trial %>%
select(response, grade) %>%
mutate(grade = paste0(" ", grade)) %>%
tbl_uvregression(
method = glm,
y = response,
method.args = list(family = binomial),
exponentiate = TRUE
) %>%
as_forest_plot()
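
One thing to watch with the manual approach: paste0() turns a factor into a character vector, and the model will then order the levels alphabetically, which can change the reference level. A minimal base-R sketch of an alternative that renames the levels in place; the grade vector below is made up for illustration and is not the trial data:

```r
# Illustrative factor whose reference (first) level is deliberately "III".
grade <- factor(c("I", "II", "III", "II"), levels = c("III", "I", "II"))
indent <- "    "  # four spaces; adjust to taste

# paste0() on the factor returns a character vector, so a later factor()
# conversion sorts the labels and changes the reference level.
as_text <- factor(paste0(indent, grade))

# Renaming the levels in place keeps the original order and reference level.
indented <- grade
levels(indented) <- paste0(indent, levels(grade))

levels(as_text)[1]   # first level after paste0() + factor()
levels(indented)[1]  # first level after renaming the levels
```

For a factor whose levels are already in alphabetical order, both approaches give the same reference level.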

Flextable: make the signif stars bold and remove the space between stars and the coefficient

I'm testing flextable, a package for R by David Gohel, for the publication workflow and I was wondering about how to format substrings in the cells of the regression tables. In particular, I aim to replicate Stata's outreg2 with significance stars in bold and right next to the coefficient.
Here is the reprex:
library(huxtable)
library(flextable)
lm(formula=mpg~cyl+hp, mtcars) %>% huxreg() %>% as_flextable()
Now I am trying to make the significance stars in bold and right next to the coefficient without the space in between.
If anyone could chime in, it would be great! I'm looking at ftExtra but I see no way how to approach this.
Thank you!
With flextable, you can do this quite easily by adding a signif column after the p-values column and using zero padding.
The following code demonstrates how to do it from scratch:
x <- lm(formula=mpg~cyl+hp, mtcars)
library(flextable)
library(broom)
library(magrittr)
pvalue_format <- function(x){
z <- cut(x, breaks = c(-Inf, 0.001, 0.01, 0.05, 0.1, Inf), labels = c("***", "**", "*", ".", ""))
as.character(z)
}
tidy(x) %>%
flextable(
col_keys = c("term", "estimate", "std.error", "statistic", "p.value", "signif")) %>%
colformat_double(digits = 4) %>%
mk_par(j = "signif", value = as_paragraph(pvalue_format(p.value)) ) %>%
set_header_labels(term = "", estimate = "Estimate",
std.error = "Standard Error", statistic = "t value",
p.value = "Pr(>|t|)", signif = "" ) %>%
align(j = "signif", align = "left") %>%
padding(padding.right = 0, j = "p.value", part = "all") %>%
bold(j = "signif", bold = TRUE) %>%
padding(padding.left = 0, j = "signif", part = "all") %>%
autofit()
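
To see what the pvalue_format() helper returns on its own, here is a small sketch with made-up p-values. Because cut() uses right-closed intervals by default, a p-value of exactly 0.001 still gets three stars:

```r
# Same helper as above: map p-values to significance stars.
pvalue_format <- function(x) {
  z <- cut(x, breaks = c(-Inf, 0.001, 0.01, 0.05, 0.1, Inf),
           labels = c("***", "**", "*", ".", ""))
  as.character(z)
}

# Illustrative p-values only, one per interval plus a boundary case.
p <- c(0.0005, 0.001, 0.03, 0.07, 0.5)
stars <- pvalue_format(p)
stars
```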
You can also use the flextable::as_flextable method and adapt the result to your needs:
as_flextable(x) %>%
delete_part(part = "footer") %>%
padding(padding.right = 0, j = "p.value", part = "all") %>%
padding(padding.left = 0, j = "signif", part = "all") %>%
color(i = ~ p.value > .05, j = "p.value", color = "gray") %>%
theme_vanilla()
Here's one possibility:
lm(formula=mpg~cyl+hp, mtcars) |>
huxreg(
stars = c(
'__*__' = 0.05,
'__**__' = 0.01
)
) |>
set_markdown(2:6, 2, TRUE) |>
as_flextable()
This uses markdown to bold the stars.
You may get errors from flextable. If so, you could save as HTML or RTF and convert to Word. (And report the bug.)

Add Chi test to plot_stackfrq in R

I need to add a chi-squared test to plot_stackfrq in R, but I didn't see any parameter of the function that accepts chi.test or similar.
This is my code:
library(dplyr)      # %>% and mutate()
library(tidyr)      # pivot_wider()
library(data.table) # rowid()
library(sjPlot)     # plot_stackfrq()
homes2 <- homes %>% mutate(rn = rowid(Educ_level)) %>%
pivot_wider(names_from = Educ_level, values_from = Insurance)
plot_stackfrq(homes2[,c("High Scholl","College","Elementary")],
geom.colors = c("YlOrBr"))
I was able to solve it with plot_xtab, which showed the chi-squared test as I needed.
plot_xtab(homes2$Educ_label, homes2$Insurance, bar.pos = "stack", show.total = FALSE, margin = "row", coord.flip = TRUE, show.n = FALSE, show.summary = TRUE,
geom.colors = c("YlOrBr"))
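
If only the test result is needed, it can also be computed directly with base R's chisq.test() on a contingency table. A minimal sketch; the counts below are made up purely for illustration and do not come from the homes data:

```r
# Made-up counts for illustration only: rows are insurance status,
# columns are education levels.
counts <- matrix(
  c(30, 20,
    25, 25,
    10, 40),
  nrow = 2,
  dimnames = list(
    Insurance = c("Yes", "No"),
    Educ_level = c("Elementary", "High School", "College")
  )
)

# Pearson chi-squared test of independence between the two variables.
test <- chisq.test(counts)
test$statistic  # chi-squared value
test$parameter  # degrees of freedom
test$p.value

# With raw columns, the table could be built first (assuming the unpivoted
# data has these two columns):
# chisq.test(table(homes$Educ_level, homes$Insurance))
```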

Adding a sparkline to a flextable in R

I am trying to use sparkline with a flextable I created for a report. Unfortunately, I can't get the function tabwid() to work, to convert this into an HTML widget. It is cited as being part of flextable v0.4.0, but it isn't in v0.5.
I note in the documentation below that the tabwid() function is deprecated:
https://www.rdocumentation.org/packages/flextable/versions/0.4.0/topics/tabwid
Is there an equivalent function in v0.5? If not, is there another way to use the sparkline function within a flextable?
I noticed a conversation regarding this with the developer of flextable in 2016, and the code described there includes the tabwid() function. I can't find an alternative and would be grateful for any guidance provided.
https://github.com/davidgohel/flextable/issues/1
The code cited in the link above is as follows:
#devtools::install_github("davidgohel/oxbase")
#devtools::install_github("davidgohel/flextable")
#devtools::install_github("htmlwidgets/sparkline")
library(dplyr)
library(oxbase)
library(flextable)
library(sparkline)
mtcars %>% group_by(cyl) %>% summarise(
hp = spk_chr(
hp, type="box",
chartRangeMin=0, chartRangeMax=max(mtcars$hp)
),
mpg = spk_chr(
mpg, type="box",
chartRangeMin=0, chartRangeMax=max(mtcars$mpg)
) ) %>% flextable() %>% tabwid() %>% spk_add_deps()
If this isn't possible, has anyone else found a way to add a sparkline into a flextable (e.g., through another package?)?
Thanks in advance for any guidance you can provide!
With the development version on GitHub, you can do the following. Note that these are not interactive graphics:
# remotes::install_github("davidgohel/flextable")
library(data.table)
library(magrittr)
library(flextable)
# data prep ----
z <- as.data.table(iris)
z <- z[, list(
Sepal.Length = mean(Sepal.Length, na.rm = TRUE),
z = list(.SD$Sepal.Length)
), by = "Species"]
# flextable ----
ft <- flextable(
data = z,
col_keys = c("Species", "Sepal.Length", "box", "density", "line")
) %>%
compose(j = "box", value = as_paragraph(
plot_chunk(
value = z, type = "box",
border = "white", col = "transparent",
width = 1.5, height = .4
)
)) %>%
compose(j = "line", value = as_paragraph(
plot_chunk(value = z, type = "line", col = "white",
width = 1.5, height = .4)
)) %>%
compose(j = "density", value = as_paragraph(
plot_chunk(value = z, type = "dens", col = "white",
width = 1.5, height = .4)
)) %>%
theme_vader() %>%
align(j = c("box", "density", "line"), align = "center", part = "all") %>%
autofit()
ft

Display Alternative color_bar value in Formattable Table

Is it possible to populate a formattable color_bar with an alternative display value (i.e. a value other than the one used to determine the size of the color_bar)?
In the table below, I want to keep the bar sizes based on ttl but display the following values instead:
c(1000,1230,1239,1222,1300,1323,1221)
library(tidyverse)
library(knitr)
library(kableExtra)
library(formattable)
tchart <- data.frame(id = 1:7,
Student = c("Billy", "Jane", "Lawrence", "Thomas", "Clyde", "Elizabeth", "Billy Jean"),
grade3 = c(55,70,75,64,62,55,76),
ttl = c(105,120,125,114,112,105,126),
avg =c(52.31,53.0,54.2,51.9,52.0,52.7,53.0))
tchart %>%
mutate(id = cell_spec(id, "html", background = "red", color = "white", align = "center")) %>%
mutate(grade3 = color_bar("lightgreen")(grade3)) %>%
mutate(ttl = color_bar("lightgray")(ttl)) %>%
mutate(avg = color_tile("white","red")(avg)) %>%
kable("html", escape = F) %>%
kable_styling("hover", full_width = F) %>%
column_spec(4, width = "4cm")
I checked the documentation and didn't see this as a possibility, but I was hoping there was a workaround or custom function solution.
I don't think you can quite pass it another set of values, but there are a couple of options that you might find workable.
One thing to note first is that color_bar() can accept two values - a color, and a function that will take the vector of values and transform them to numbers between 0 and 1. By default, that function is formattable::proportion(), which compares everything against the max value. But if you used your display values for ttl, you could conceivably transform the bars to be whatever length you wanted by writing your own function. (See: https://rdrr.io/cran/formattable/man/color_bar.html)
Another possibility would be to make your own formatter. Some examples here:
https://www.littlemissdata.com/blog/prettytables
So, I think you can put the numbers you want in the display, and hopefully can use a function to transform or map those values to get the bar lengths between 0 and 1 that you're looking for.
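
As a rough sketch of such a function, the following rescales values by their range rather than by the maximum alone. The name range01 is hypothetical, and the color_bar() usage in the final comment is an assumption based on the fun argument described above:

```r
# Hypothetical scaling function: map values onto [0, 1] by their range.
# `...` is accepted so that extra arguments can be passed through and ignored.
range01 <- function(x, ...) {
  lo <- min(x, na.rm = TRUE)
  hi <- max(x, na.rm = TRUE)
  (x - lo) / (hi - lo)
}

ttl <- c(105, 120, 125, 114, 112, 105, 126)
widths <- range01(ttl)
widths

# Assumed usage with formattable:
# color_bar("lightgray", fun = range01)(ttl)
```

With this scaling the smallest value gets a bar of zero width, which may or may not be what you want.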
Add a new variable, ttl_bar, to determine the size of the bar, and let the variable ttl hold the value to display. I use gsub() to replace the ttl_bar value with the ttl value in the generated HTML.
tchart <- data.frame(id = 1:7,
Student = c("Billy", "Jane", "Lawrence", "Thomas", "Clyde", "Elizabeth", "Billy Jean"),
grade3 = c(55,70,75,64,62,55,76),
ttl = c(1000,1230,1239,1222,1300,1323,1221),
avg =c(52.31,53.0,54.2,51.9,52.0,52.7,53.0),
ttl_bar = c(105,120,125,114,112,105,126))
tchart %>%
mutate(id = cell_spec(id, "html", background = "red", color = "white", align = "center")) %>%
mutate(grade3 = color_bar("lightgreen")(grade3)) %>%
mutate(avg = color_tile("white","red")(avg)) %>%
mutate(ttl = pmap(list(ttl_bar, ttl, color_bar("lightgray")(ttl_bar)), gsub)) %>%
select(-ttl_bar) %>%
kable("html", escape = F) %>%
kable_styling("hover", full_width = F) %>%
column_spec(4, width = "4cm")
To be more careful, anchor the match to the cell text by rewriting the gsub() call like this: mutate(ttl = pmap(list(ttl_bar, ttl, color_bar("lightgray")(ttl_bar)), ~ gsub(paste0(">", ..1, "<"), paste0(">", ..2, "<"), ..3))).
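
Why the anchoring matters can be shown on a single cell. The markup below is a simplified, hypothetical version of what color_bar() produces; the real output carries more styling:

```r
# Simplified, hypothetical cell markup: the number 10 appears both in the
# bar width and as the displayed text.
cell <- '<span style="width: 10.00%">10</span>'

# Unanchored replacement also rewrites the width, breaking the bar.
loose <- gsub("10", "1000", cell)

# Anchoring on the surrounding ">" and "<" touches only the displayed text.
# fixed = TRUE treats the pattern literally, so a "." in a value such as
# 52.31 is not interpreted as a regex wildcard.
strict <- gsub(">10<", ">1000<", cell, fixed = TRUE)

loose
strict
```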
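I came up with a better way, which passes a custom function to color_bar(), as in the following code. Note that here ttl_bar holds the values to display and ttl determines the bar width.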
override = function(x, y) y / 200
tchart <- data.frame(id = 1:7,
Student = c("Billy", "Jane", "Lawrence", "Thomas", "Clyde", "Elizabeth", "Billy Jean"),
grade3 = c(55,70,75,64,62,55,76),
ttl = c(105,120,125,114,112,105,126),
avg =c(52.31,53.0,54.2,51.9,52.0,52.7,53.0),
ttl_bar = c(1000,1230,1239,1222,1300,1323,1221))
tchart %>%
mutate(id = cell_spec(id, "html", background = "red", color = "white", align = "center")) %>%
mutate(grade3 = color_bar("lightgreen")(grade3)) %>%
mutate(avg = color_tile("white","red")(avg)) %>%
mutate(ttl = color_bar("lightgray", fun = override, ttl)(ttl_bar)) %>%
select(-ttl_bar) %>%
kable("html", escape = F) %>%
kable_styling("hover", full_width = F) %>%
column_spec(4, width = "4cm")
