library(gtsummary)
library(flextable)
library(dplyr)

as_flex_table_ex1 <-
  trial %>%
  select(trt, age, grade) %>%
  tbl_summary(by = trt, missing = "no") %>%
  add_p() %>%
  as_flex_table() %>%
  valign(j = 1, valign = "center")
save_as_docx(as_flex_table_ex1, path = "1.docx")
Now, when I open the table in Word, the spacing between the rows for grade I, II and III is very "airy". How can I reduce this padding? I tried padding(as_flex_table_ex1, padding.top = 0, padding.bottom = 0, part = "all"), but I can't really see a difference in the rendered docx.
After a bit of research, this seems to do the trick:
as_flex_table_ex1 <-
trial %>%
select(trt, age, grade) %>%
tbl_summary(by = trt,missing = "no") %>%
add_p() %>%
as_flex_table() %>%
valign(j = 1,valign = "center")
as_flex_table_ex2 <- as_flex_table_ex1 %>%
hrule(rule = "exact") %>%
height(height=max(dim_pretty(.,part="body")$heights+0.3*max(dim_pretty(.,part="body")$heights)))
print(as_flex_table_ex2,preview="docx")
The factor of 0.3 applied to the dim_pretty height is completely arbitrary. If someone can find a legible height that adapts automatically to different fonts and sizes, that would be great.
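The height rule above can be wrapped in a small helper so that the scaling factor becomes a named argument instead of a magic number. This is only a sketch: compact_rows and its extra argument are hypothetical names, and it assumes dim_pretty() measures the body rows using the fonts already set on the table, so fonts and sizes should be applied before calling it.

```r
library(flextable)

# Hypothetical helper: fix every body row at the tallest "pretty" height
# plus a proportional margin (extra = 0.3 reproduces the factor used above).
compact_rows <- function(ft, extra = 0.3) {
  tallest <- max(dim_pretty(ft, part = "body")$heights)
  ft <- hrule(ft, rule = "exact", part = "body")
  height(ft, height = tallest * (1 + extra), part = "body")
}

# Example on a small built-in data set
ft <- compact_rows(flextable(head(iris)), extra = 0.3)
```

Lowering extra tightens the rows; whether a given value stays legible still has to be checked in the rendered docx.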
I've run into the same issue. These are the commands I use to try to achieve a more compact table (much of what you've already tried). This is available in the function gtsummary::theme_gtsummary_compact().
x %>%
flextable::fontsize(size = 8, part = "all") %>%
flextable::padding(padding.top = 0, part = "all") %>%
flextable::padding(padding.bottom = 0, part = "all")
I have had more success using an R Markdown Word template, which helps direct the formatting in the Word doc.
https://bookdown.org/yihui/rmarkdown-cookbook/word-template.html
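For reference, a minimal sketch of applying the compact theme mentioned above to the table from the question. It assumes gtsummary, flextable and dplyr are installed; add_p() is left out only to keep the example small, and compact_ft is a name chosen for illustration.

```r
library(gtsummary)
library(flextable)
library(dplyr, warn.conflicts = FALSE)

# The theme affects tables built after this call
theme_gtsummary_compact()

compact_ft <- trial %>%
  select(trt, age, grade) %>%
  tbl_summary(by = trt, missing = "no") %>%
  as_flex_table()

# Restore the default theme so later tables are not affected
reset_gtsummary_theme()
```

The theme has to be set before the table is built; setting it afterwards does not change an existing object.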
Is there a way to programmatically select the last row of a table in a pipeline and highlight it with gt?
Highlighting a particular row index is trivial:
library(dplyr)
library(gt)
mtcars %>%
head(5) %>%
gt() %>%
tab_style(
style = list(cell_fill(color = "lightblue")),
locations = cells_body(columns = everything(), rows = 2)
)
But I would like to highlight the last row. Surprisingly, using tidyselect::last_col appears to work, but I do not understand why?
mtcars %>%
head(5) %>%
gt() %>%
tab_style(
style = list(cell_fill(color = "lightblue")),
locations = cells_body(columns = everything(), rows = tidyselect::last_col())
)
I have tried a few approaches like ~ nrow(.) and +Inf, but those fail. In addition, I would like to avoid a pre-determined index because I want to keep everything in a pipeline. That is, I cannot do something like LAST_ROW <- nrow(mtcars), because the number of rows is undetermined until the data goes through the pipeline.
This is one way to do it. Simply supply the length of one of the columns. I think it would be better to access the data frame from within the tab_style command, but I don't know if that's possible.
library(dplyr, warn.conflicts = FALSE)
library(gt)
mtcars %>%
head(5) %>%
gt() %>%
tab_style(
style = list(cell_fill(color = "lightblue")),
locations = cells_body(columns = everything(), rows = length(mpg))
)
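Another option, sketched here under the assumption that wrapping the gt call in a function is acceptable, is to compute the row count inside that function from whatever the pipeline hands in. highlight_last_row is a hypothetical helper name, not part of gt.

```r
library(dplyr, warn.conflicts = FALSE)
library(gt)

# Hypothetical helper: the row count is read from the data the pipeline
# passes in, so nothing has to be computed before the pipe starts.
highlight_last_row <- function(data, color = "lightblue") {
  last_row <- nrow(data)
  gt(data) %>%
    tab_style(
      style = list(cell_fill(color = color)),
      locations = cells_body(columns = everything(), rows = last_row)
    )
}

tbl <- mtcars %>%
  head(5) %>%
  highlight_last_row()
```

This keeps the index out of the global environment and does not depend on naming a particular column.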
I'm trying to use kable and kableExtra to format tables created using pipes, and I can't get the conditional formatting arguments (row_spec, column_spec) to accept variables piped from the generated code.
In the toy example below I create a variable called bg within the dataframe that I want to use to create bands of background colour, but row_spec and column_spec don't seem to recognize that as a variable. Note that creating the variable outside of the pipes isn't an option - the actual use case is much more complicated than that, and the variables used in the process don't exist before that.
library(kableExtra)
set.seed(111)
df = data.frame(var1 = sort(sample(LETTERS[1:3],10,TRUE)),
var2 = sample(1:4,10,TRUE),
var3 = runif(10,0,1))
df %>%
mutate(bg = cumsum(!duplicated(var1))%%2)%>%
kable() %>%kable_styling()%>%
column_spec(1,color=bg)
Error in ensure_len_html(color, nrows, "color") : object 'bg' not found
EDIT: You can do it in two steps easily enough, so I'll include that here, as well as the resulting table that I'm looking for
d = df %>%
mutate(bg = cumsum(!duplicated(var1))%%2)
kable(d) %>% kable_styling(full_width=FALSE) %>%
row_spec(which(d$bg==1),background=grey(0.75))
This might work:
library(tidyverse)
library(kableExtra)
set.seed(111)
df = data.frame(var1 = sort(sample(LETTERS[1:3],10,TRUE)),
var2 = sample(1:4,10,TRUE),
var3 = runif(10,0,1))
df %>%
mutate(bg = cumsum(!duplicated(var1))%%2) %>%
mutate(bg = cell_spec(bg, color = spec_color(bg))) %>%
kable(escape = F) %>% kable_styling()
You may find more information here (page 16): https://haozhu233.github.io/kableExtra/awesome_table_in_pdf.pdf
Version with background color:
df %>%
mutate(bg = cumsum(!duplicated(var1))%%2) %>%
mutate(bg = cell_spec(bg, color = "white", bold = T,
background = spec_color(bg))) %>%
kable(escape = F) %>% kable_styling()
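If the row banding from the two-step version in the question is what is wanted, it can stay in a single pipeline by moving the last steps into a function. row_spec() and column_spec() take plain vectors rather than looking up column names in the piped data, which is why bg was not found. The sketch below assumes that approach; banded_kable is a hypothetical helper name.

```r
library(dplyr, warn.conflicts = FALSE)
library(kableExtra)

set.seed(111)
df <- data.frame(var1 = sort(sample(LETTERS[1:3], 10, TRUE)),
                 var2 = sample(1:4, 10, TRUE),
                 var3 = runif(10, 0, 1))

# Hypothetical helper: receives the piped data frame as `d`, so columns
# created earlier in the pipe (here `bg`) can be referenced as d$bg.
banded_kable <- function(d) {
  knitr::kable(d, format = "html") %>%
    kable_styling(full_width = FALSE) %>%
    row_spec(which(d$bg == 1), background = grey(0.75))
}

out <- df %>%
  mutate(bg = cumsum(!duplicated(var1)) %% 2) %>%
  banded_kable()
```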
Hello to all expss experts (@Gregory Demin, if you read this message!). After a few days discovering this package, I have achieved nice things but still struggle a bit to create complex crosstabs with the tab_* family of functions, especially combinations with significance tests.
Let's start with an example given on the reference manual:
library(expss)
mtcars %>%
tab_significance_options(keep = "none", sig_labels = NULL, subtable_marks = "greater", mode = "append") %>%
tab_cols(total(), vs, am) %>%
tab_cells(cyl, gear) %>%
tab_stat_cpct() %>%
tab_last_add_sig_labels() %>%
tab_last_sig_cpct() %>%
tab_last_hstack("inside_columns") %>%
tab_pivot(stat_position = "inside_rows")
From this point, I do not know if the following actions are possible, and if so what scripts would do the trick:
1) It is quite simple with 'fre' function to display counts and percentages side by side, but is limited to this only purpose. How can we add the cases to the crosstab? (in the form of cases/percents/tests, in 3 distinct columns)
2) By default the significance tests output in this example is LETTERS, at 0.05 level. Both parameters can be changed. But is it possible to include two significance levels in a single table calculation? Something in the spirit of:
sig_level = c(0.01, 0.05)
sig_labels = c(LETTERS, letters)
3) Last (probably an easy one?), is there a possibility to force display of zeros? I have factor levels with frequencies=0, displayed with 0s in base R tables. With expss the label stays but the rows/columns remain empty.
Again, maybe what I am looking for does not exist with expss, but at least I will be sure of it.
Thank you!
Your second point (two-level significance) is not possible right now. However, you can add a second significance level with additional calculations on a specially prepared table.
1 and 3 are quite easy:
library(expss)
data(mtcars)
mtcars %>%
tab_significance_options(keep = "none", sig_labels = NULL, subtable_marks = "greater", mode = "append") %>%
tab_cols(total(), vs, am) %>%
tab_cells(cyl, gear) %>%
# block for cases
tab_stat_cases(label = "cases") %>%
tab_last_add_sig_labels() %>%
# block for percent statistic
tab_stat_cpct(label = "%") %>% # percent
tab_last_add_sig_labels() %>%
tab_last_sig_cpct() %>%
tab_pivot(stat_position = "inside_columns") %>%
# converts NA to zero
recode(as.criterion(is.numeric) & is.na ~ 0, TRUE ~ copy)
UPDATE:
You can specify parts of the chain as custom functions to avoid repetition:
library(expss)
data(mtcars)
### tab cols
my_banner = mtcars %>%
tab_cols(total(), vs, am)
### table and formatting
my_custom_table = . %>%
tab_significance_options(keep = "none", sig_labels = NULL, subtable_marks = "greater", mode = "append") %>%
# block for cases
tab_stat_cases(label = "cases") %>%
tab_last_add_sig_labels() %>%
# block for percent statistic
tab_stat_cpct(label = "%") %>% # percent
tab_last_add_sig_labels() %>%
tab_last_sig_cpct() %>%
tab_pivot(stat_position = "inside_columns") %>%
# converts NA to zero
recode(as.criterion(is.numeric) & is.na ~ 0, TRUE ~ copy)
### here we build table
my_banner %>%
tab_cells(cyl, gear) %>%
my_custom_table()
Hei,
To compare several variants of data I produced a HTML report.
For a given category, some indexes in the database should be identical. To detect errors or incorrect entries in the database, I compare the different categories in a table.
For readability, it would be nice to have coloured tables. This can be done easily with the formattable package.
My dataset:
require(tidyverse)
require(formattable)
require(kableExtra)
require(knitr)
df1 <- data.frame(V1 = c(68,sample(c("J","N"),size=15,replace = TRUE)),
V2 = c(10,sample(c("J","N"),size=15,replace = TRUE)),
V3 = c(1,sample(c("J","N"),size=15,replace = TRUE))
)
It has, in this example, 3 different variants. Only one is recommended. The variant with the highest N (the first entry in each Vx column) is assumed to be the real one.
My formatted table is produced with this code:
df1 %>%
mutate(
V2 = ifelse((as.character(V2) == as.character(V1)) == FALSE,
cell_spec(V2, color = "red",bold = TRUE),
cell_spec(V2, color = "black",bold = FALSE)),
V3 = ifelse((as.character(V3) == as.character(V1)) == FALSE,
cell_spec(V3, color = "red",bold = TRUE),
cell_spec(V3, color = "black",bold = FALSE))
) %>%
kable(format = "html", escape = FALSE) %>%
kable_styling(c("striped", "condensed"), full_width = FALSE) %>%
row_spec(1, bold = T, color = "white", background = "#D7261E")
Two questions:
1) How can I mutate in a loop?
This is necessary because the categories I have to investigate can have up to 18 different variants. In each dataset, V1 is always the reference variant.
2) As you can see (run the code!), the first line (the "N"s) is formatted in the wrong manner. Is it possible to compare from the second line on only (with the first line set to TRUE by default)?
This would be good, because the first line is currently formatted in a way that does not really make sense.
Thank you!
To answer your two questions:
1) Instead of looping over the columns, you can use mutate_all.
2) Just take a copy of the first column and mutate it back in later.
I have first made your cell_spec calls into functions to reduce clutter in the code.
red <- function(x) cell_spec(x, color = "red", bold = TRUE)
black <- function(x) cell_spec(x, color = "black", bold = FALSE)
c1 <- as.character(df1[[1]])
Now we can do this:
df1 %>%
select(-V1) %>%
mutate_all(function(x) ifelse(as.character(x) != df1[[1]], red(x), black(x))) %>%
mutate(V1 = black(c1)) %>%
mutate_all(function(x) `[<-`(x, 1, " ")) %>%
select(V1, V2, V3) %>%
kable(format = "html", escape = FALSE) %>%
kable_styling(c("striped", "condensed"), full_width = FALSE) %>%
row_spec(1, bold = T, color = "white", background = "#D7261E")
Thank you, @AllanCameron!
I'm not familiar with the purrr package; I really should study it more.
Your idea with purrr::map_dfc solved the problem.
Instead of the first column I need the first row (the digit row), and of course this can be solved with grepl. The condition in the ifelse statement just gets a little longer.
My final solution is:
df1 %>%
map_dfc(function(x) ifelse(as.character(x) != as.character(df1$V1) & !grepl("[[:digit:]]",x),
mark_true(x), mark_false(x))) %>%
select(V1, everything()) %>%
kable(format = "html", escape = FALSE) %>%
kable_styling(c("striped", "condensed"), full_width = FALSE) %>%
row_spec(1, bold = T, color = "white", background = "#D7261E")
Thank you very much!
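For anyone on dplyr 1.0 or later, the same idea can probably be written with across(), which scales to any number of variant columns and can skip the first row by position rather than by matching digits. This is a sketch, not the code from the answers above: the seed and the name marked are chosen for illustration, and format = "html" is passed explicitly so it also runs outside knitr.

```r
library(dplyr, warn.conflicts = FALSE)
library(kableExtra)

set.seed(1)
df1 <- data.frame(
  V1 = c(68, sample(c("J", "N"), size = 15, replace = TRUE)),
  V2 = c(10, sample(c("J", "N"), size = 15, replace = TRUE)),
  V3 = c(1,  sample(c("J", "N"), size = 15, replace = TRUE))
)

red   <- function(x) cell_spec(x, format = "html", color = "red", bold = TRUE)
black <- function(x) cell_spec(x, format = "html", color = "black", bold = FALSE)

# Compare every column except V1 against V1; row 1 (the counts) is never
# flagged because of the row_number() > 1 condition.
marked <- df1 %>%
  mutate(across(
    -V1,
    ~ ifelse(row_number() > 1 & as.character(.x) != as.character(V1),
             red(.x), black(.x))
  ))
```

The result can then be passed to kable(format = "html", escape = FALSE) and styled as before.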
I'm using R version 3.6.1 in RStudio. I have flextable version 0.5.5 and officer version 0.3.5.
I'm having difficulty with formatting my numbers in flextables within RMarkdown. By default, all numbers show up with 3 decimal places. For some of my numbers, this is fine (and actually preferred), but for others, I want to remove the decimals.
Using the advice found here I was able to adjust my table so that all numbers are rounded to the nearest whole number. My code is below (example table used for reproducibility; otherwise formatting is the same as my current code).
ft_test <- head(iris) %>% flextable() %>%
hline(part = 'header', border = fp_border(color = "black", width = 3)) %>%
align(align ='center', part = 'all') %>%
align(j = 1, align ='left', part = 'all') %>%
set_formatter_type(fmt_double = "%.0f")
ft_test
However, I only want certain columns to be whole numbers, and other columns to still have decimals. I've tried using the j argument to call certain columns:
ft_test <- head(iris) %>% flextable() %>%
hline(part = 'header', border = fp_border(color = "black", width = 3)) %>%
align(align ='center', part = 'all') %>%
align(j = 1, align ='left', part = 'all') %>%
set_formatter_type(fmt_double = "%.0f", j = 2)
ft_test
... but then I get an error telling me j = 2 is an unused argument.
Any suggestions for how to adjust the numbers of only some columns? Thanks in advance for your help!
You cannot use the argument j because it is not an argument of set_formatter_type. That function sets formatters for one or several data types. In your case, it is better to use colformat_num.
library(flextable)
library(officer)
library(magrittr)
ft_test <- head(iris) %>% flextable() %>%
hline(part = 'header', border = fp_border(color = "black", width = 3)) %>%
align(align ='center', part = 'all') %>%
align(j = 1, align ='left', part = 'all') %>%
colformat_num(j = c("Sepal.Length", "Sepal.Width",
"Petal.Length", "Petal.Width"), digits = 1)
ft_test
You can learn more about formatting content here: https://davidgohel.github.io/flextable/articles/display.html
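A note for readers on newer flextable releases than the 0.5.5 mentioned in the question: as far as I know, per-column digits for double columns are handled there by colformat_double(). The sketch below assumes such a version is installed; calling it once per group of columns gives whole numbers in some columns and decimals in others.

```r
library(flextable)

ft <- flextable(head(iris))

# Whole numbers for the sepal measurements
ft <- colformat_double(ft, j = c("Sepal.Length", "Sepal.Width"), digits = 0)

# Two decimals for the petal measurements
ft <- colformat_double(ft, j = c("Petal.Length", "Petal.Width"), digits = 2)
```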