produce contingency table in rmarkdown using kable - r

I am trying to produce a well-formatted contingency table in an rmarkdown HTML document. Here is the code:
---
title: "Probabilidad"
author: "Nicolás Molano Gonzalez"
date: "7 de Abril de 2020"
output:
  html_document:
    fig_caption: true
---
```{r echo=F, message = FALSE, warning =F}
library(tidyverse)
library(kableExtra)
library(knitr)
set.seed(150)
```
This is the data for the table:
```{r echo=FALSE, results = 'asis'}
ca_ctr_r<-.3
n <- 250
nCA <- round(n*ca_ctr_r)
z0 <- data.frame(status=c(rep("CA",nCA),rep("CTR",n-nCA)))
z0$exposition <- NA
exp_CA <- .45
exp_CTR <- .19
z0[z0$status %in% "CA","exposition"] <- ifelse(runif(nCA) < exp_CA,"yes","no")
z0[z0$status %in% "CTR","exposition"] <- ifelse(runif(n-nCA) < exp_CTR,"yes","no")
z0$exposition <- factor(z0$exposition,levels = c("yes","no"))
```
Here is the code that prints the contingency table, which should be improved:
```{r echo=FALSE, results = 'asis'}
res <- kable(t(table(z0)%>%addmargins))
#res <- kable(t(table(z0)))
kable_styling(res,"striped", position = "center",full_width = F) %>% add_header_above(c("exposition","status"=2," "))
```
I want the output of the code to be similar to that of base R, namely:
          status
exposition  CA CTR Sum
       no   40  96 136
       yes  35  79 114
       Sum  75 175 250
add_header_above gives me the title for the columns, but I am struggling to place the title for the rows (exposition) in the right position.

I tried a workaround by explicitly adding a column to the left of the table before passing it to kable.
library(tidyverse)
library(kableExtra)
library(knitr)
cont.table = mtcars %>% select(gear, carb) %>%
group_by_all() %>% tally() %>%
spread(key = gear, value = n)
cont.table %>%
rename("\t" = carb) %>%
add_column(" " = c("carb", rep(" ", nrow(.) - 1)), .before = "\t") %>%
kable() %>%
kable_styling(position = "left", full_width = F) %>%
add_header_above(c("", "", "gear", rep(" ", ncol(cont.table) - 2))) %>%
column_spec(1:2, bold = TRUE)
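A variation on this workaround, sketched under assumptions: rather than padding the table with an extra column, the table with margins can be turned into a data frame whose first column holds the row labels and is named after the row variable. The toy data frame z is hypothetical and only stands in for z0 from the question; the kableExtra calls are the same ones already used above.

```r
library(knitr)
library(kableExtra)

# Hypothetical toy data standing in for z0 from the question
z <- data.frame(
  status     = c("CA", "CA", "CA", "CTR", "CTR", "CTR", "CTR"),
  exposition = c("yes", "no", "yes", "no", "no", "yes", "no")
)

# Rows = exposition, columns = status, plus "Sum" margins
tab <- addmargins(table(z$exposition, z$status))

# Keep the counts as columns and move the row labels into a named column
df <- as.data.frame.matrix(tab)
df <- cbind(exposition = rownames(df), df)
rownames(df) <- NULL

# The first header cell stays blank; "status" spans CA, CTR and Sum
out <- kable(df, format = "html")
out <- kable_styling(out, "striped", position = "center", full_width = FALSE)
out <- add_header_above(out, c(" " = 1, "status" = 3))
out
```

With this layout the row title sits in the ordinary header row and the column title in the row above it, which is close to, but not identical with, the base R print layout.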

Related

RMarkdown: Color largest percentage in each row of Kable?

How can RMarkdown correctly color the largest percentage in each row of a kable table? The code below incorrectly colors cells as if the percentages were sorted in descending order by their first digit. Thank you in advance!
Code:
---
title: "Color Max Percentage"
output:
  html_document: default
  pdf_document: default
---
```{r setup, include = F}
library(tidyverse)
library(knitr)
library(kableExtra)
options(knitr.table.format = "html")
df = data.frame(
x = c(1, 2, 3, 4, 5),
a = c("12.7%", "14.0%", "49.2%", "20.4%", "23.2%"),
b = c("35.6%", "19.0%", "9.1%", "25.5%", "11.2%"),
c = c("6.9%", "54.1%", "31.3%", "15.4%", "17.5%")
)
df <- df %>%
# adorn_totals('row') %>%
rowwise() %>%
mutate(across(a:c, ~cell_spec(.x, format = "html",
color = ifelse(.x == max(c_across(a:c)), "red", "blue"))))
df %>%
kable(escape = F) %>%
kable_styling()
```
(Incorrect) output:
You are trying to take the maximum of character vectors (i.e. c("12.7%", "35.6%", "6.9%")) and in R,
max(c("12.7%", "35.6%", "6.9%"))
#> [1] "6.9%"
and from ?max and ?Comparison,
Character versions are sorted lexicographically, and this depends on the collating sequence of the locale in use: the help for ‘Comparison’ gives details.
Character strings can be compared with different marked encodings (see Encoding): they are translated to UTF-8 before comparison.
sort(c("12.7%", "35.6%", "6.9%"), decreasing = TRUE)
#> [1] "6.9%" "35.6%" "12.7%"
So we need to convert them to numbers with readr::parse_number() before comparing. To print the cell values in percent format, we can use the formattable::percent() function.
---
title: "Color Max Percentage"
output:
  html_document: default
  pdf_document: default
---
```{r setup, include = F}
library(tidyverse)
library(knitr)
library(kableExtra)
options(knitr.table.format = "html")
df = data.frame(
x = c(1, 2, 3, 4, 5),
a = c("12.7%", "14.0%", "49.2%", "20.4%", "23.2%"),
b = c("35.6%", "19.0%", "9.1%", "25.5%", "11.2%"),
c = c("6.9%", "54.1%", "31.3%", "15.4%", "17.5%")
)
df <- df %>%
# adorn_totals('row') %>%
mutate(across(a:c, ~ readr::parse_number(.x) / 100)) %>%
rowwise() %>%
mutate(across(
a:c,
~ cell_spec(
formattable::percent(.x, digits = 1),
format = "html",
color = ifelse(.x == max(c_across(a:c)), "red", "blue")
)
))
df %>%
kable(escape = F) %>%
kable_styling()
```
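The difference between comparing the strings and comparing the parsed numbers can also be checked in base R. This is only a minimal sketch: it uses sub() and as.numeric() in place of readr::parse_number(), which is a swapped-in technique and not what the answer above uses.

```r
pct <- c("12.7%", "35.6%", "6.9%")

# Character comparison is lexicographic, so "6.9%" wins on its first digit
max(pct)

# Strip the percent sign and compare as numbers instead
num <- as.numeric(sub("%", "", pct, fixed = TRUE))
largest <- pct[which.max(num)]
largest
```

Here largest holds the entry with the numerically largest value, which is the cell that should be colored red.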

Why using kableExtra library wrongly formats output of RMarkdown to Word

If I uncomment the kableExtra library, the Word output becomes wrongly formatted, but the HTML output is always right. Is kableExtra compatible with knitting to Word?
---
title: "kableExtra2Word"
output:
  word_document: default
  html_document: default
---
knitr::opts_chunk$set(echo = TRUE)
library(tidyverse)
library(knitr)
#library(kableExtra)
library(janitor)
MT <- tibble(Session = c(1,1,1,1,1,1,2,2,2,2),
scores = rep('Not', 10),
B_A = rep('A', 10))
t <- MT %>%
tabyl(Session, scores, B_A, show_missing_levels = FALSE) %>%
adorn_totals(where = "col")
kable(t)
kableExtra doesn't seem to be compatible with Word output, so you may want to try an alternative such as flextable::regulartable:
```{r}
MT <- tibble(Session = c(1,1,1,1,1,1,2,2,2,2),
scores = rep('Not', 10),
B_A = rep('A', 10))
t <- MT %>%
tabyl(Session, scores, B_A, show_missing_levels = FALSE) %>%
adorn_totals(where = "col")
flextable::regulartable(t$A)
```

rmarkdown: kable, xtable or tab_df tables in Word doc

---
output:
  word_document: default
---
```{r setup, include=FALSE}
data("mtcars")
library(tidyverse)
library(xtable)
library(sjPlot)
library(kableExtra)
```
```{r, results='asis'}
df <- mtcars %>%
group_by(cyl) %>%
summarise(disp = mean(disp),
wt = mean(wt),
n = n()
)
kable(df)
# tab_df(df)
# xtable(df)
```
I have tried xtable, tab_df, and kable to generate a word document with a table. When "knit to HTML document", all tables looked fine. When "knit to Word", xtable didn't show the table while tab_df and kable produced a table with only one column:
kable(df)
cyl
disp
wt
n
4
105.1364
2.285727
I experimented with flextable quite a bit the last few days and it might be the best option when you have to work with Word:
---
output:
  word_document: default
---
```{r setup, include=FALSE}
data("mtcars")
library(tidyverse)
library(flextable)
```
```{r, results='asis'}
mtcars %>%
group_by(cyl) %>%
summarise(disp = mean(disp),
wt = mean(wt),
n = n()
) %>%
flextable() %>%
align(part = "all") %>% # left align
set_caption(caption = "Table 1: Example") %>%
font(fontname = "Calibri (Body)", part = "all") %>%
fontsize(size = 10, part = "body") %>%
# add footer if you want
# add_footer_row(values = "* p < 0.05. ** p < 0.01. *** p < 0.001.",
# colwidths = 4) %>%
theme_booktabs() %>% # default theme
autofit()
```
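If the same document has to knit to both HTML and Word, one option is to pick the table backend from the output format. The sketch below is not from the answers above: render_table is a hypothetical helper, and it assumes that a plain knitr::kable() table converts cleanly to Word as long as kableExtra is not attached with library(), in line with the observation in the earlier question.

```r
library(knitr)

# Hypothetical helper: styled HTML table for HTML output, plain kable otherwise
render_table <- function(df) {
  if (knitr::is_html_output() && requireNamespace("kableExtra", quietly = TRUE)) {
    # kableExtra is called by namespace only, never attached
    kableExtra::kable_styling(knitr::kable(df, format = "html"), full_width = FALSE)
  } else {
    # Word, PDF, or an interactive session: leave the format to knitr
    knitr::kable(df)
  }
}

tbl <- render_table(head(mtcars[, 1:4]))
tbl
```

For Word output the else branch could just as well call flextable::flextable(df), as in the answer above.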

Add "greater than or equal" symbol to LaTeX table using kable and group_rows

As the title says, I am trying to use the group_rows function to tidy up my table as shown below. I have added the ≥ symbol ($\geq$) to column 5 (i.e. ≥rowid), but the symbol is not displayed correctly when the column is used for group_rows. Can anyone help? Thanks!
---
output:
  pdf_document:
    keep_tex: true
header-includes:
  - \usepackage{colortbl}
  - \usepackage{tikz}
papersize: a4
editor_options:
  chunk_output_type: console
---
```{r setup, include=FALSE}
library(dplyr)
library(knitr)
library(kableExtra)
knitr::opts_chunk$set(warning=FALSE, message=FALSE, echo=FALSE)
options(kableExtra.latex.load_packages = FALSE)
```
```{r cars, results='asis'}
data.df <- iris %>%
data.frame %>%
group_by(Species) %>%
filter(row_number()<=3) %>%
mutate(rowid=1:n()) %>%
ungroup %>%
mutate(Species=as.character(Species)) %>%
mutate(Species=paste0('$\\geq$',Species)) %>%
mutate(rowid=paste0('$\\geq$',rowid)) %>%
rename('$\\geq$rowid'='rowid')
data.df %>%
select(-Species) %>%
kable(.,format = 'latex',booktabs=TRUE,escape = FALSE,longtable=TRUE) %>%
group_rows(index = auto_index(data.df$Species)) %>%
kable_styling(latex_options = c('repeat_header','striped','HOLD_position'))
```
The grouped row headers are put inside a \textbf{} statement and, somehow, extra text sanitization is done in the process. If you use escape = F inside group_rows and add extra backslashes, it works:
data.df <- iris %>%
data.frame %>%
group_by(Species) %>%
filter(row_number()<=3) %>%
mutate(rowid=1:n()) %>%
ungroup %>%
mutate(Species=as.character(Species)) %>%
mutate(Species=paste0('$\\\\geq$', Species)) %>% # extra backslashes
mutate(rowid=paste0('$\\geq$',rowid)) %>%
rename('$\\geq$rowid'='rowid')
data.df %>%
select(-Species) %>%
kable(., format = 'latex', booktabs=TRUE, escape = FALSE, longtable=TRUE) %>%
group_rows(index = auto_index(data.df$Species), escape = F) %>% # escape = F
kable_styling(latex_options = c('repeat_header','striped','HOLD_position'))
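To see what "extra backslashes" means at the string level, the two variants can be inspected in base R. This sketch only shows how many literal backslashes each R string contains; it does not reproduce what group_rows does with them internally.

```r
one <- '$\\geq$'    # R source with two backslashes = ONE literal backslash
two <- '$\\\\geq$'  # R source with four backslashes = TWO literal backslashes

# cat() prints the raw characters rather than the escaped representation
cat(one, "\n")
cat(two, "\n")

# The second string is exactly one character longer
nchar(one)
nchar(two)
```

The variant with two literal backslashes is the one used for the Species column in the answer above, while the rowid column keeps a single literal backslash.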

Combining cells in R, LaTeX

Can I have two summary statistics in one cell in R-generated LaTeX tables? I'm using the tables package and I'd like to summarize a column as mean (sd). Here's a reproducible example in rmarkdown.
---
output: pdf_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(tables)
library(dplyr)
```
```{r table, results='asis'}
set.seed(1)
iris2 <- iris %>% mutate(Region = factor(sample(c('East', 'West', 'Central'), 150, replace = TRUE)))
tabular((Species + 1) ~ (Region + 1) * Sepal.Length * (mean + sd),
data = iris2) %>%
latex %>%
print
```
The output looks like this:
But I want the cells to look like e.g. 5.037 (0.3041). Is that possible?
You can probably get what you want using paste().
E.g., for a numeric vector x:
paste0(round(mean(x), digits = 3), " (", round(sd(x), digits = 4), ")")
The Paste pseudo-function does the trick, but introduces unnecessary padding (see: Remove padding from pasted cells in LaTex, R). Example code:
---
output: pdf_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(tables)
library(dplyr)
```
```{r table, results='asis'}
set.seed(1)
iris2 <- iris %>% mutate(Region = factor(sample(c('East', 'West', 'Central'), 150, replace = TRUE)))
tabular((Species + 1) ~ (Region + 1) * Sepal.Length * Paste(Percent(), length, sep = '\\% (', postfix = ')'),
data = iris2) %>%
latex %>%
print
```
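The Paste example above produces cells of the form percent (n). For the mean (sd) layout asked for in the question, the cell text itself can be sketched in base R. This does not use the tables package, and mean_sd is a hypothetical helper name introduced only for illustration.

```r
# Hypothetical helper: format a numeric vector as "mean (sd)"
mean_sd <- function(x, digits = 3) {
  paste0(
    formatC(mean(x), format = "f", digits = digits),
    " (",
    formatC(sd(x), format = "f", digits = digits),
    ")"
  )
}

# One "mean (sd)" string per species
cells <- tapply(iris$Sepal.Length, iris$Species, mean_sd)
print(cells)
```

Whether such a function can be dropped directly into a tabular() formula is not verified here; the sketch only shows the string format.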
