I would like to apply some LaTeX-style formatting to column headings in a pander table in R Markdown, knitting to PDF.
In the toy document below, the LaTeX commands that work for the elements of the data frame do not work for the headings. Specifically, I would like (1) to italicise some headings and (2) to have headings that contain spaces (at the moment R automatically replaces each space with a period). More generally, I am interested in how to get the headings of a data frame to accept the same LaTeX commands as its elements.
---
title: "Chapter 12: Adding to the Discrete Time Hazard Model"
output:
pdf_document: default
html_document: null
word_document: null
toc: yes
linestretch: 1.3
classoption: fleqn
header-includes:
- \setlength{\mathindent}{0pt}
- \setlength\parindent{0pt}
- \usepackage{amssymb}
---
```{r global_options, include=FALSE, echo = FALSE}
# This sets global knitr options (i.e. options for the entire document). The following suppresses warnings in the output and sets plot parameters. Note that setting dev to pdf allows us to set the size of graphs easily.
rm(list = ls())
knitr::opts_chunk$set(fig.width=12, fig.height=8, fig.path='Figs/',
echo=FALSE, warning=FALSE, message=FALSE, dev = 'pdf')
```
``` {r table p 446}
abC <- 0.3600344
bC <- 0.2455304
intC <- 0.4787285
dfTrans <- data.frame("Prototype" = c("$\\textit{Left/not Blue}$", "Left/Blue", "Right/not Blue", "Right/Blue"),
"$LEFT$" = c(0,1,0,1),
"$\\textit{BLUE}$" = c(0,0,1,1),
`Combined Parameter Estimates` = c(paste("0 x ", round(abC,4), "+ 0 x", round(bC,4), "+ 0 x", round(intC, 4), sep = " "), 8, 9, 0))
library(pander)
panderOptions('table.split.table', 300) # this forces the table to the width of the page.
pander(dfTrans, justify = "left")
```
I'm not sure how to do this with pander, but here is a method using the kable function from knitr and kableExtra functions for detailed table formatting. I haven't changed the yaml markup, but the updated code chunks are pasted in below, followed by the output.
```{r global_options, include=FALSE, echo = FALSE}
# This sets global knitr options (i.e. options for the entire document). The following suppresses warnings in the output and sets plot parameters. Note that setting dev to pdf allows us to set the size of graphs easily.
knitr::opts_chunk$set(fig.width=12, fig.height=8, fig.path='Figs/',
echo=FALSE, warning=FALSE, message=FALSE, dev = 'pdf')
# rm(list = ls()) This is unnecessary. knitr runs the rmarkdown document in a clean session.
library(knitr)
library(kableExtra)
options(knitr.table.format = "latex") # latex output (instead of default html)
library(tidyverse) # For dplyr pipe (%>%) and mutate
```
```{r table p 446}
abC <- 0.3600344
bC <- 0.2455304
intC <- 0.4787285
# I've removed the latex formatting from the data frame code
dfTrans <- data.frame(Prototype = c("Italic_Left/not Blue", "Left/Blue", "Right/not Blue", "Right/Blue"),
LEFT = c(0,1,0,1),
BLUE = c(0,0,1,1),
`Combined Parameter Estimates` = c(paste("0 x ", round(abC,4), "+ 0 x", round(bC,4), "+ 0 x", round(intC, 4), sep = " "), 8, 9, 0))
# Remove periods in column names
names(dfTrans) = gsub("\\.", " ", names(dfTrans))
# Two other options:
# 1. Use the tibble function from tidyverse (the successor to the deprecated data_frame), rather than the base data.frame function.
# tibble doesn't add periods, so you won't need to fix the column names afterwards.
# 2. Set check.names=FALSE in data.frame
# Use kableExtra cell_spec function to format on a cell-by-cell basis
dfTrans = dfTrans %>%
mutate(Prototype = cell_spec(Prototype, color=c("black","blue"),
align=rep(c("l","r"), each=2)))
# Format each of the column names using kableExtra text_spec
names(dfTrans)[1] = text_spec(names(dfTrans)[1], italic=TRUE)
names(dfTrans)[2] = text_spec(names(dfTrans)[2], align="l")
names(dfTrans)[3] = text_spec(names(dfTrans)[3], align="r", italic=TRUE, color="blue")
names(dfTrans)[4] = text_spec(names(dfTrans)[4], align="r")
# Output the table
kable(dfTrans, booktabs=TRUE, escape=FALSE)
```
One thing I'm not sure how to do yet is to format just the first value of dfTrans$Prototype as italic. cell_spec seems to use only the first value of an italic logical vector, so the following italicizes the whole column:
dfTrans = dfTrans %>%
mutate(Prototype = cell_spec(Prototype, color=c("black","blue"),
align=rep(c("l","r"), each=2),
italic=c(TRUE, rep(FALSE, n()-1))))
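For the first value only, one possible workaround (a sketch, not something the kableExtra documentation prescribes) is to skip cell_spec for the italics and wrap that single value in \textit{} by hand; with escape=FALSE in kable the markup should pass through to LaTeX unchanged. The same snippet shows the check.names=FALSE option mentioned in the code comments above. The data is a cut-down copy of dfTrans, and the values in the last column are placeholders.

```r
# check.names = FALSE keeps the space-separated column name as written
dfTrans <- data.frame(
  Prototype = c("Left/not Blue", "Left/Blue", "Right/not Blue", "Right/Blue"),
  LEFT = c(0, 1, 0, 1),
  BLUE = c(0, 0, 1, 1),
  `Combined Parameter Estimates` = c("a", "b", "c", "d"),  # placeholder values
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Wrap only the first value in \textit{}; the other rows stay untouched
dfTrans$Prototype[1] <- paste0("\\textit{", dfTrans$Prototype[1], "}")

# In the Rmd chunk you would then call, for example:
# kable(dfTrans, booktabs = TRUE, escape = FALSE)
print(names(dfTrans))
print(dfTrans$Prototype)
```

Note that applying cell_spec to the same column afterwards would escape the backslash again, so the manual markup should be the last step for that cell.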
Here is a huxtable-based solution (my package):
abC <- 0.3600344
bC <- 0.2455304
intC <- 0.4787285
dfTrans <- data.frame(Prototype = c("Italic_Left/not Blue", "Left/Blue", "Right/not Blue", "Right/Blue"),
LEFT = c(0,1,0,1),
BLUE = c(0,0,1,1),
`Combined Parameter Estimates` = c(paste("0 x ", round(abC,4), "+ 0 x", round(bC,4), "+ 0 x", round(intC, 4), sep = " "), 8, 9, 0))
library(huxtable)
huxTrans <- hux(dfTrans, add_colnames = TRUE) # column names become first row
huxTrans[1, 4] <- 'Combined Parameter Estimates' # get rid of the dots
align(huxTrans)[4:5, 1] <- 'right'
text_color(huxTrans)[c(3, 5), 1] <- 'blue'
text_color(huxTrans)[1, 3] <- 'blue'
italic(huxTrans)[1, c(1, 3)] <- TRUE
huxTrans # will automatically become LaTeX in RMarkdown
quick_pdf(huxTrans)
Which looks like this in the terminal:
And this in PDF output:
You can add borders as well if you want.
I want to generate an R Markdown HTML document with plots in which it is possible to jump to a certain plot using the search function (in my example there are 3 plots and I want to jump to the plot whose main title is "rivers").
I think the problem is that the main title and axis labels of a plot are graphical elements, like the plot itself, and not text, so the search function doesn't find them.
Of course it would be possible to manually add text before each plot, but as all my plots are generated in a for loop, I don't know how to do it.
Is there a possibility to include text output in this kind of for loop, or are there other ideas for making the main title or axis labels of a plot searchable?
Thanks in advance!
---
title: "search function test"
author: "Michel Grün"
date: "last edited `r format(Sys.Date(),'%d.%m.%Y')`"
output:
html_document:
df_print: paged
---
knitr::opts_chunk$set(echo = TRUE,warning = FALSE)
df<-data.frame(x=seq(1,20),
trees=rnorm(20,4,3),
mountains=rnorm(20,6,3),
rivers=rnorm(20,4,4))
for(i in 2:length(colnames(df))){
plot(df$x,df[,i],
main=colnames(df)[i],
xlab=colnames(df)[1],
ylab=colnames(df)[i])
}
Solved in another question: https://stackoverflow.com/a/57034752/16578253
The question there is slightly different, but a solution shown there also solved my problem. The idea is to create headings and outputs within a loop. As a result, every heading in the output document is followed by a plot, and the heading is of course text that the search function can find. It's important to use the chunk option results='asis' so that the output of cat() is interpreted as Markdown syntax. Furthermore, the text passed to cat() should be surrounded by some newlines to make sure it's interpreted properly.
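The heading-in-a-loop idea can be sketched as follows. This is a minimal example that assumes the code sits in a chunk declared with results='asis'; the helper name md_heading and the heading level are illustrative choices, not part of the linked answer.

```r
# Assumes the surrounding chunk header is: ```{r, results='asis'}
set.seed(1)
df <- data.frame(x = seq(1, 20),
                 trees = rnorm(20, 4, 3),
                 mountains = rnorm(20, 6, 3),
                 rivers = rnorm(20, 4, 4))

# Hypothetical helper: build a Markdown heading padded with blank lines
md_heading <- function(title, level = 2) {
  paste0("\n\n", strrep("#", level), " ", title, "\n\n")
}

for (col in colnames(df)[-1]) {
  cat(md_heading(col))   # searchable text emitted before each plot
  plot(df$x, df[[col]], main = col, xlab = "x", ylab = col)
}
```

Because each heading is real text in the HTML, searching for "rivers" in the browser jumps to the heading directly above that plot.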
You can combine an SVG device with a knitr hook:
---
title: "search function test"
author: "Michel Grün"
date: "last edited `r format(Sys.Date(),'%d.%m.%Y')`"
output:
html_document:
df_print: paged
---
```{r setup}
library(ggplot2)
library(knitr)
# see https://github.com/yihui/knitr/issues/754
local({
hook_plot <- knit_hooks$get("plot")
knit_hooks$set(plot = function(x, options) {
x <- paste(x, collapse = ".")
if (!grepl("\\.svg", x)) {
return(hook_plot(x, options))
}
# read the content of the svg image and write it out without <?xml ... ?>
paste(readLines(x)[-1], collapse = "\n")
})
})
opts_chunk$set(echo = TRUE, warning = FALSE, dev = "svglite")
df <- data.frame(
x = seq(1, 20),
trees = rnorm(20, 4, 3),
mountains = rnorm(20, 6, 3),
rivers = rnorm(20, 4, 4)
)
```
```{r}
for (i in 2:length(colnames(df))) {
plot(df$x, df[, i],
main = paste0(colnames(df)[i], " äöα😋"),
xlab = colnames(df)[1],
ylab = colnames(df)[i]
)
}
```
I'm trying to knit a table in a large R Markdown document. The table represents disease levels on a farm, so some values are expressed as percentages. When a disease level is above target I would like to highlight it by changing the font color to red. This works fine when the text does not have a % label, but when I escape the % label, it is always preceded by an unwanted "\" in the PDF document. I'm a vet, not a data scientist; I have spent many hours on this but can't find an answer.
The original document is extremely complex, pulling data from many sources to generate the current disease levels and targets and using Shiny input to determine various options, so it is not possible to use code from the original document. But I have produced a very minimal reproducible example below.
require(kableExtra)
require(scales)
library(knitr)
require(tidyverse)
knitr::opts_chunk$set(echo = TRUE)
MDF = data.frame(a = label_percent()(.07),b = label_percent()(.05))
MDF$a = as.character(MDF$a)
MDF$b = as.character(MDF$b)
MDF[1,] = apply(MDF[1,],2,function(f) gsub("%", "\\\\%", f))
MDF = MDF %>% mutate(a = cell_spec(a, color = ifelse(a > 6,"red","black")))
kable(MDF, "latex", escape = F, booktabs = T)
As far as I can tell, the issue is that the percent sign in the conditionally formatted cell gets escaped twice, once via the gsub and once more when the cell is formatted (cell_spec escapes special characters by default), so that you end up with an additional \textbackslash{} inserted in your LaTeX code. Hence, one option to solve your issue would be to manually escape the percent signs only in the column to which you don't apply the conditional formatting:
---
title: "Untitled"
output: pdf_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
```{r}
require(kableExtra)
require(scales)
library(knitr)
require(tidyverse)
MDF = data.frame(a = .07, b = .05) %>%
mutate(across(c(a, b), percent)) %>%
# Escape percent sign in column b only
mutate(across(c(b), ~ gsub("%", "\\\\%", .x))) %>%
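# Compare the numeric value, not the string (parse_number comes from readr, loaded by tidyverse)
mutate(a = cell_spec(a, color = ifelse(parse_number(a) > 6, "red", "black")))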
kable(MDF, "latex", escape = F, booktabs = T)
```
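An alternative that sidesteps the double escaping, sketched under the assumption that you are willing to build the LaTeX for the cell yourself instead of using cell_spec: decide the color from the numeric values, then format and escape the label exactly once. The 6% target and the helper name color_percent are made up for this example, and \textcolor needs the color or xcolor LaTeX package, which kableExtra normally loads for PDF output.

```r
# Hypothetical helper: color a proportion red when it exceeds the target
color_percent <- function(p, target) {
  label <- sprintf("%.0f\\%%", 100 * p)        # percent sign escaped once for LaTeX
  color <- ifelse(p > target, "red", "black")  # compare numbers, not strings
  paste0("\\textcolor{", color, "}{", label, "}")
}

MDF <- data.frame(a = color_percent(0.07, target = 0.06),
                  b = color_percent(0.05, target = 0.06),
                  stringsAsFactors = FALSE)
print(MDF)

# In the Rmd chunk you would then call, for example:
# kable(MDF, "latex", escape = FALSE, booktabs = TRUE)
```

Comparing the numbers before they become strings also avoids surprises such as "10%" sorting below "6" in a character comparison.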
Within a for loop I create some graphics and related tables.
The tables are created with writeLines() and print().
For creating the graphics I use plot(), boxplot(), mtext(), text(), axis().
So one graphic is created in many steps. Each graphic by itself is complete and looks fine.
My problem is:
when I knit the markdown document in RStudio, the graphics and tables are not in the correct place.
The first graphic is shown where the second should be, or a little before, after cor.test.default() issues a warning. It is definitely shown during the next pass of the for loop.
Summary of the problem
Textual output created in addition to graphics gets mixed up under certain circumstances.
To reproduce the problem I create some data.
All the computations are collected in the chunk "workingChunk".
To demonstrate the problem I use the chunk "loops" at the end.
So read the last chunk first.
In the second loop the FIRST graphic is placed in the output
during the SECOND pass of the loop, after the function cor.test() issues the warning.
Likewise, the SECOND graphic is placed in the output
during the THIRD pass of the loop, after the function cor.test() issues the warning.
And so on.
I found a workaround for this problem, but it is not really handy:
When I replace the for loop with single chunk calls, the output is in the correct order.
So I'm sure that the reason for the problem is the interaction of the for loop and the function cor.test().
Here is the example code (about 140 lines):
---
title: "Test graphic & textual output"
output:
pdf_document: default
word_document:
html_document:
df_print: paged
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
```{r creatingData, echo=TRUE}
### {r creatingData, echo=TRUE}
# creating some data
a.df <- data.frame(height=c(1:19),
width=c(21:39)*10,
depht=c(41:59)*20,
group=c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3))
a.df$groupF <- as.factor(a.df$group)
Vars <- names(a.df)[c(1:3)]
```
```{r workingChunk, echo=TRUE, eval=FALSE}
cat("\n===================================================================\n",
"following the calculations for:\n",
"YVar:",YVar, "\n Group:", "group", "\n")
# Extract the data for the analysis
selvar <- c("group","groupF", YVar)
# Exclude records with missing values in the target variable!
a.sub <- a.df[ complete.cases(a.df[ , selvar]), selvar]
# print(str(a.sub))
## Table for the grouped target variable
mT <- table(a.sub[ , c("groupF", YVar)])
print(ftable(addmargins(mT))) ## absolute frequencies
writeLines("\n")
if (runCorTest) {
## calculating Spearmans correlation
myCorTest <- try(cor.test(a.sub[["group"]], a.sub[[YVar]],
method = "spearman", alternative = "two.sided" ))
print(myCorTest)
writeLines("\n")
}
## preparing the graphic
GL.x1 <- levels(a.sub[["groupF"]]) ## group labels
# Calculating the range of Y
my.ylim <- c(min(a.sub[[YVar]], na.rm = TRUE), max(a.sub[[YVar]], na.rm = TRUE))
at.x <- c(1:length(GL.x1)) ## create the label positions
G.data <- vector("list", length(GL.x1)) ## create vectors for the group data
# Extract the data of the groups
G.data <- split(a.sub[[YVar]], a.sub["groupF"])
# print(str(G.data))
## drawing empty plot
cat("\n\n>>> Here should be placed the graphic for:",YVar, "<<<\n")
plot( x = NA, y = NA, type = "n",
main = YVar,
xlim = c( 1, length( GL.x1)) + c( -0.6, 0.6),
ylim = my.ylim,
xlab = NA, ylab = NA,
axes = FALSE, frame.plot = TRUE)
# X-axis
axis( 1, las = 1, labels = FALSE)
mtext(GL.x1, at = at.x, cex=0.8, side = 1, line = .3)
# Y-axis
axis( 2, las = 1)
## drawing the data
for (i in 1:length(G.data)){
boxplot(G.data[i], # col = "white",
at = at.x[i], range = 0, add = TRUE,
boxwex = 0.6, yaxs = "i", axes = FALSE)
}
```
```{r, loops, echo=FALSE}
cat("\n===================================================================",
"\n===================================================================\n",
"calling the workingChunk within a for loop without executing cor.test()",
"\n works fine!!",
"\n===================================================================",
"\n===================================================================\n")
runCorTest <- FALSE
for ( YVar in Vars) {
<<workingChunk>>
}
cat("\n===================================================================",
"\n===================================================================\n",
"calling the workingChunk within a for loop with executing cor.test() ",
"\n mixes up the textual output and the graphics!!",
"\n===================================================================",
"\n===================================================================\n")
runCorTest <- TRUE
for ( YVar in Vars) {
<<workingChunk>>
}
cat("\n===================================================================",
"\n===================================================================\n",
"calling the workingChunk with executing cor.test() ",
"\n workaround without a for loop works fine!!",
"\n===================================================================",
"\n===================================================================\n")
runCorTest <- TRUE
YVar <- Vars[1]
<<workingChunk>>
YVar <- Vars[2]
<<workingChunk>>
YVar <- Vars[3]
<<workingChunk>>
```
Not a real answer but a much shorter test case and a workaround:
---
title: "Test graphic & textual output"
output:
html_document:
df_print: paged
word_document: default
pdf_document:
keep_tex: yes
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
```{r}
foo <- function(letter, warn) {
cat("Doing", letter, "\n")
print(letter)
if (warn) warning(letter)
cat("Graphic for", letter, "goes here", "\n")
plot(1, main = letter)
}
```
# with loop and with warning
```{r}
for (letter in letters[1:3])
foo(letter, TRUE)
```
# with loop and without warning
```{r}
for (letter in letters[1:3])
foo(letter, FALSE)
```
# without loop and with warning
```{r}
foo("a", TRUE)
foo("b", TRUE)
foo("c", TRUE)
```
# with loop and with suppressed warning
```{r warning=FALSE}
for (letter in letters[1:3])
foo(letter, TRUE)
```
Besides removing all of the data processing, I have also switched from a named chunk to a function, which I find more idiomatic in R. That does not change the strange ordering, though.
If the warning message is not important in your case, you can use the workaround shown at the end: suppressing the warning with the chunk option warning=FALSE restores the ordering.
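If the warning comes from cor.test() with method = "spearman" on tied data, as in the question, another option is to avoid or silence it at the call itself rather than for the whole chunk. This is only a sketch: it assumes, as the test case suggests, that the reordering occurs only when a warning is raised, and it has not been checked against every output format. The data is a reduced copy of a.df.

```r
# Reduced copy of the question's data: group labels repeat, so there are ties
a.df <- data.frame(height = 1:19,
                   group = c(rep(1:4, times = 4), 1:3))

# Option 1: silence the warning for this one call only
quiet <- suppressWarnings(
  cor.test(a.df$group, a.df$height, method = "spearman")
)

# Option 2: request the asymptotic p-value, so no exact p-value is attempted
asymptotic <- cor.test(a.df$group, a.df$height,
                       method = "spearman", exact = FALSE)

print(asymptotic$estimate)
```

Either way, warnings from other code in the same chunk remain visible, which the chunk-wide warning=FALSE option would hide.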
Say I'd like to display a table of coefficients from several equations in an R Markdown file (html output).
I'd like the table to look somewhat like this:
But I can't for the life of me figure out how to tell R Markdown to parse the column names in the table.
The closest I've gotten is a hacky solution using cat to print custom table from my data.frame... not ideal. Is there a better way to do this?
Here's how I created the image above, saving my file as an .Rmd in RStudio.
---
title: "Math in R Markdown tables"
output:
html_notebook: default
html_document: default
---
My fancy table
```{r, echo=FALSE, include=TRUE, results="asis"}
# Make data.frame
mathy.df <- data.frame(site = c("A", "B"),
b0 = c(3, 4),
BA = c(1, 2))
# Do terrible things to print it properly
cat("Site|$\\beta_0$|$\\beta_A$")
cat("\n")
cat("----|---------|---------\n")
for (i in 1:nrow(mathy.df)){
cat(as.character(mathy.df[i,"site"]), "|",
mathy.df[i,"b0"], "|",
mathy.df[i,"BA"],
"\n", sep = "")
}
```
You can use kable() and its escape option to format math notation (see this answer to a related question). Then you assign your mathy headings as the column names, and there you go:
---
title: "Math in R Markdown tables"
output:
html_notebook: default
html_document: default
---
My fancy table
```{r, echo=FALSE, include=TRUE, results="asis"}
library(knitr)
mathy.df <- data.frame(site = c("A", "B"),
b0 = c(3, 4),
BA = c(1, 2))
colnames(mathy.df) <- c("Site", "$\\beta_0$", "$\\beta_A$")
kable(mathy.df, escape=FALSE)
```
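If you would rather keep syntactically valid names on the data frame itself, kable() also accepts the headings through its col.names argument. A small sketch of that variant, reusing mathy.df from above:

```r
library(knitr)

mathy.df <- data.frame(site = c("A", "B"),
                       b0 = c(3, 4),
                       BA = c(1, 2))

# The data frame keeps its plain names; only the printed header changes
tbl <- kable(mathy.df,
             col.names = c("Site", "$\\beta_0$", "$\\beta_A$"),
             escape = FALSE)
print(tbl)
```

This keeps later code such as mathy.df$b0 working, since the math notation lives only in the rendered table.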
I would like to create an automated knitr report that will produce histograms for each numeric field within my dataframe. My goal is to do this without having to specify the actual fields (this dataset contains over 70 and I would also like to reuse the script).
I've tried a few different approaches:
- Saving the plot to an object, p, and then calling p after the loop. This only plots the final plot.
- Creating an array of plots, PLOTS <- NULL, and appending the plots within the loop with PLOTS <- append(PLOTS, p). Accessing these plots outside the loop did not work at all.
- I even tried saving each plot to a .png file, but would rather not have to deal with the overhead of saving and then re-accessing each file.
I'm afraid the intricacies of the plot devices are escaping me.
Question
How can I make the following chunk output each plot within the loop to the report? Currently, the best I can achieve is output of the final plot produced by saving it to an object and calling that object outside of the loop.
R markdown chunk using knitr in RStudio:
```{r plotNumeric, echo=TRUE, fig.height=3}
suppressPackageStartupMessages(library(ggplot2))
FIELDS <- names(df)[sapply(df, class)=="numeric"]
for (field in FIELDS){
qplot(df[,field], main=field)
}
```
From this point, I hope to customize the plots further.
Wrap the qplot in print.
knitr will do that for you if the qplot is outside a loop, but (at least the version I have installed) doesn't detect this inside the loop (which is consistent with the behaviour of the R command line).
I'd like to add a quick note:
I googled the same question and landed on this page.
Now, in 2018, you can just use print() in the loop.
for (i in 1:n){
...
f <- ggplot(.......)
print(f)
}
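A fuller sketch of the print() approach applied to the original question, with made-up example data standing in for df. It selects columns with is.numeric() (which, unlike comparing the class to "numeric", also catches integer columns) and uses ggplot() directly, since qplot() is deprecated in recent ggplot2 releases.

```r
library(ggplot2)

set.seed(1)
# Made-up data; replace with your own data frame
df <- data.frame(a = rnorm(50), b = runif(50), n = 1:50,
                 label = sample(letters, 50, replace = TRUE))

fields <- names(df)[sapply(df, is.numeric)]

# Build one histogram per numeric column and keep them in a named list
plots <- lapply(fields, function(field) {
  ggplot(df, aes(x = .data[[field]])) +
    geom_histogram(bins = 20) +
    ggtitle(field)
})
names(plots) <- fields

# Inside a loop nothing is auto-printed, so print each plot explicitly
for (p in plots) {
  print(p)
}
```

Keeping the plots in a list also makes it possible to customize or print individual ones later, for example plots[["a"]].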
I am using child Rmd files in markdown; this also works in Sweave.
In the Rmd, use the following snippet:
```{r run-numeric-md, include=FALSE}
out = NULL
for (i in c(1:num_vars)) {
out = c(out, knit_child('da-numeric.Rmd'))
}
```
da-numeric.Rmd looks like:
Variable `r num_var_names[i]`
------------------------------------
Missing : `r sum(is.na(data[[num_var_names[i]]]))`
Minimum value : `r min(na.omit(data[[num_var_names[i]]]))`
Percentile 1 : `r quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[2]`
Percentile 99 : `r quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[100]`
Maximum value : `r max(na.omit(data[[num_var_names[i]]]))`
```{r results='asis', comment="" }
warn_extreme_values=3
d1 = quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[2] > warn_extreme_values*quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[1]
d99 = quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[101] > warn_extreme_values*quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[100]
if(d1){cat('Warning : Suspect extreme values in left tail')}
if(d99){cat('Warning : Suspect extreme values in right tail')}
```
``` {r eval=TRUE, fig.width=6, fig.height=2}
library(ggplot2)
v <- num_var_names[i]
hp <- ggplot(na.omit(data), aes_string(x=v)) + geom_histogram( colour="grey", fill="grey", binwidth=diff(range(na.omit(data[[v]]))/100))
hp + theme(axis.title.x = element_blank(),axis.text.x = element_text(size=10)) + theme(axis.title.y = element_blank(),axis.text.y = element_text(size=10))
```
See my dataMineR package on GitHub:
https://github.com/hugokoopmans/dataMineR
As an addition to Hugo's excellent answer, I believe that in 2016 you need to include a print command as well:
```{r run-numeric-md, include=FALSE}
out = NULL
for (i in c(1:num_vars)) {
out = c(out, knit_child('da-numeric.Rmd'))
}
```
`r paste(out, collapse = '\n')`
For knitting Rmd to HTML, I find it more convenient to have a list of figures. In this case I get the desired output with results='hide' as follows:
---
title: "Make a list of figures and show it"
output:
html_document
---
```{r}
suppressPackageStartupMessages({
library(ggplot2)
library(dplyr)
requireNamespace("scater")
requireNamespace("SingleCellExperiment")
})
```
```{r}
plots <- function() {
print("print")
cat("cat")
message("message")
warning("warning")
# These calls generate unwanted text
scater::mockSCE(ngene = 77, ncells = 33) %>%
scater::logNormCounts() %>%
scater::runPCA() %>%
SingleCellExperiment::reducedDim("PCA") %>%
as.data.frame() %>%
{
list(
f12 = ggplot(., aes(x = PC1, y = PC2)) + geom_point(),
f22 = ggplot(., aes(x = PC2, y = PC3)) + geom_point()
)
}
}
```
```{r, message=FALSE, warning=TRUE, results='hide'}
plots()
```
Only the plots and the warnings are shown (and you can switch the warnings off as well).