Different pander performance between .Rmd and .Rnw


I'm having some trouble getting pander to work the same way in a .Rnw file as it does in my .Rmd file. In both, I'm using knitr to weave the pdf. My .Rmd file looks like
---
title: "My title"
output: pdf_document
---
```{r}
library(pander)
panderOptions('big.mark', ',')
panderOptions('table.style','rmarkdown')
panderOptions('table.alignment.default','right')
panderOptions('table.alignment.rownames','left')
panderOptions('digits','10')
df <- data.frame('a' = 1:3, 'b' = 4:6, 'c' = 7:9)
pander(df)
```
The data frame looks really nice in the output, because the dashed lines are converted into solid ones. However, when I try to do something similar in my .Rnw file, everything is printed as if it were a sentence rather than a table.
\documentclass{article}
\begin{document}
<<model_data>>=
library(pander)
panderOptions('big.mark', ',')
panderOptions('table.style','rmarkdown')
panderOptions('table.alignment.default','right')
panderOptions('table.alignment.rownames','left')
panderOptions('digits','10')
df <- data.frame('a' = 1:3, 'b' = 4:6, 'c' = 7:9)
pander(df)
@
\end{document}
This looks terrible. I can call pandoc.table(df) instead, and then it is at least printed in a table format, but it has dashed lines, unlike the output from the .Rmd file. How do I get it to print exactly as it does in the .Rmd file, but inside an .Rnw file?
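One direction worth trying, which is not from the original thread: pander emits Pandoc markdown, and a .Rnw file is compiled as LaTeX without a Pandoc step, so the markdown table is typeset as running text. A minimal sketch, assuming the knitr package is installed, is to ask for a LaTeX table directly inside the .Rnw chunk. The options set through panderOptions above do not carry over to kable.

```r
library(knitr)

df <- data.frame(a = 1:3, b = 4:6, c = 7:9)

# format = "latex" returns a tabular environment instead of a markdown table;
# inside a .Rnw chunk the result is written into the .tex output
out <- kable(df, format = "latex")
print(out)
```

Whether the result matches the .Rmd rendering exactly depends on the LaTeX packages loaded in the preamble, so treat this as a starting point.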

Related

Run R Markdown on many different datasets and save each knitted word document separately

I created an R Markdown to check for errors in a series of datasets (e.g., are there any blanks in a given column? If so, then print a statement that there are NAs and which rows have the NAs). I have setup the R Markdown to output a bookdown::word_document2. I have about 100 datasets that I need to run this same R Markdown on and get a word document output for each separately.
Is there a way to run this same R Markdown across all of the datasets and get a new word document for each (and so they are not overwritten)? All the datasets are in the same directory. I know that the output is overwritten each time you knit the document; thus, I need to be able to save each word document according to the dataset/file name.
Minimal Example
Create a Directory with 3 .xlsx Files
library(openxlsx)
setwd("~/Desktop")
dir.create("data")
dataset <-
structure(
list(
name = c("Andrew", "Max", "Sylvia", NA, "1"),
number = c(1, 2, 2, NA, NA),
category = c("cool", "amazing",
"wonderful", "okay", NA)
),
class = "data.frame",
row.names = c(NA,-5L)
)
write.xlsx(dataset, './data/test.xlsx')
write.xlsx(dataset, './data/dataset.xlsx')
write.xlsx(dataset, './data/another.xlsx')
RMarkdown
---
title: Hello_World
author: "Somebody"
output:
bookdown::word_document2:
fig_caption: yes
number_sections: FALSE
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
setwd("~/Desktop")
library(openxlsx)
# Load data for one .xlsx file. The other datasets are all in "/data".
dataset <- openxlsx::read.xlsx("./data/test.xlsx")
```
# Test for Errors
```{r test, echo=FALSE, comment=NA}
# Are there any NA values in the column?
suppressWarnings(if (TRUE %in% is.na(dataset$name)) {
na.index <- which(is.na(dataset$name))
cat(
paste(
"– There are NAs/blanks in the name column. There should be no blanks in this column. The following row numbers in this column need to be corrected:",
paste(na.index, collapse = ', ')
),
".",
sep = "",
"\n",
"\n"
)
})
```
So, I would run this R Markdown with the first .xlsx dataset (test.xlsx) in the /data directory, and save the word document. Then, I would want to do this for every other dataset listed in the directory (i.e., list.files(path = "./data")) and save a new word document. So, the only thing that would change in each RMarkdown would be this line: dataset <- openxlsx::read.xlsx("./data/test.xlsx"). I know that I need to set up some parameters, which I can use in rmarkdown::render, but I am unsure how to do it.
I have looked at some other SO entries (e.g., How to combine two RMarkdown (.Rmd) files into a single output? or Is there a way to generate a cached version of an RMarkdown document and then generate multiple outputs directly from the cache?), but most focus on combining .Rmd files, and not running different iterations of the same file. I've also looked at Passing Parameters to R Markdown.
I have also tried the following from this. Here, all the additions were added to the example R Markdown above.
Added this to the YAML header:
params:
directory:
value: x
Added this to the setup code chunk:
# Pull in the data
dataset <- openxlsx::read.xlsx(file.path(params$directory))
Then, finally I run the following code to render the document.
rmarkdown::render(
input = 'Hello_World.Rmd'
, params = list(
directory = "./data"
)
)
However, I get the following error, although I only have .xlsx files in /data:
Quitting from lines 14-24 (Hello_World.Rmd) Error: openxlsx can only
read .xlsx files
I also tried this on my full .Rmd file and got the following error, although the paths are exactly the same.
Quitting from lines 14-24 (Hello_World.Rmd) Error in file(con,
"rb") : cannot open the connection
Note: lines 14–24 are essentially the setup section of the .Rmd.
I'm unsure of what I need to change. I also need to generate multiple output files, using the original filename (like "test" from test.xlsx, "another" from another.xlsx, etc.)
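A likely reason for the first error, offered as an assumption rather than a confirmed diagnosis: params$directory holds "./data", which is a directory, so read.xlsx never receives a path ending in .xlsx. The base R sketch below only illustrates the difference between the two paths. How openxlsx validates its input is not shown here.

```r
# The value passed as `directory` in the render() call above
directory <- "./data"

# A directory path has no file extension
ext_dir <- tools::file_ext(file.path(directory))

# A path to one workbook inside that directory does
ext_file <- tools::file_ext(file.path(directory, "test.xlsx"))

print(c(directory = ext_dir, file = ext_file))
```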

You could call render in a loop to process each file, passing the file path as a parameter:
dir_in <- 'data'
dir_out <- 'result'
files <- file.path(getwd(),dir_in,list.files(dir_in))
for (file in files) {
print(file)
rmarkdown::render(
input = 'Hello_World.Rmd',
output_file = tools::file_path_sans_ext(basename(file)),
output_dir = dir_out,
params = list(file = file)
)
}
Rmarkdown:
---
title: Hello_World
author: "Somebody"
output:
bookdown::word_document2:
fig_caption: yes
number_sections: FALSE
params:
file: ""
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(openxlsx)
# Load the .xlsx file passed in through the `file` parameter.
dataset <- openxlsx::read.xlsx(params$file)
```
# Test for Errors
```{r test, echo=FALSE, comment=NA}
# Are there any NA values in the column?
suppressWarnings(if (TRUE %in% is.na(dataset$name)) {
na.index <- which(is.na(dataset$name))
cat(
paste(
"– There are NAs/blanks in the name column. There should be no blanks in this column. The following row numbers in this column need to be corrected:",
paste(na.index, collapse = ', ')
),
".",
sep = "",
"\n",
"\n"
)
})
```
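The file discovery and output naming used in the loop can be checked on their own before rendering anything. This is a minimal sketch using base R only. The temporary directory and the extra notes.txt file are illustrative assumptions, added to show that a pattern filter keeps stray files out of the batch.

```r
# Throwaway directory standing in for "data" (file names are illustrative)
dir_in <- file.path(tempdir(), "data_demo")
dir.create(dir_in, showWarnings = FALSE)
invisible(file.create(file.path(dir_in, c("test.xlsx", "another.xlsx", "notes.txt"))))

# Keep only .xlsx files and return full paths in one step
files <- list.files(dir_in, pattern = "\\.xlsx$", full.names = TRUE)

# Output names: drop the directory, then the extension
out_names <- tools::file_path_sans_ext(basename(files))
print(out_names)
```

Each element of out_names is what the loop passes as output_file, so every dataset gets its own document.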

An alternative using purrr rather than the for loop, with exactly the same setup as @Waldi's answer.
Render
dir_in <- 'data'
dir_out <- 'result'
files <- file.path(getwd(),dir_in,list.files(dir_in))
purrr::map(.x = files, .f = function(file){
rmarkdown::render(
input = 'Hello_World.Rmd',
output_file = tools::file_path_sans_ext(basename(file)),
output_dir = dir_out,
params = list(file = file)
)
})
Rmarkdown
---
title: Hello_World
author: "Somebody"
output:
bookdown::word_document2:
fig_caption: yes
number_sections: FALSE
params:
file: ""
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(openxlsx)
# Load the .xlsx file passed in through the `file` parameter.
dataset <- openxlsx::read.xlsx(params$file)
```
# Test for Errors
```{r test, echo=FALSE, comment=NA}
# Are there any NA values in the column?
suppressWarnings(if (TRUE %in% is.na(dataset$name)) {
na.index <- which(is.na(dataset$name))
cat(
paste(
"– There are NAs/blanks in the name column. There should be no blanks in this column. The following row numbers in this column need to be corrected:",
paste(na.index, collapse = ', ')
),
".",
sep = "",
"\n",
"\n"
)
})
```
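With around 100 datasets, one unreadable file would stop the whole batch. A possible refinement, sketched here with base R, wraps each call in tryCatch and records failures. The function render_one is a hypothetical stand-in for the rmarkdown::render() call, and the file names are made up for illustration.

```r
# Hypothetical stand-in for the rmarkdown::render() call; it fails on one input
render_one <- function(file) {
  if (grepl("broken", file)) stop("cannot read ", file)
  paste0(tools::file_path_sans_ext(basename(file)), ".docx")
}

files <- c("data/test.xlsx", "data/broken.xlsx", "data/another.xlsx")

# Collect either the output name or the error message for each file
results <- vapply(files, function(file) {
  tryCatch(render_one(file),
           error = function(e) paste("FAILED:", conditionMessage(e)))
}, character(1))

print(unname(results))
```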

Run selected chunks from one Rmd in another

I've run my analyses in a source Rmd file and would like to knit a clean version from a final Rmd file using only a few of the chunks from the source. I've seen a few answers with regard to pulling all of the chunks from a source Rmd in Source code from Rmd file within another Rmd and How to source R Markdown file like `source('myfile.r')`?. I share the concern with these posts in that I don't want to port out a separate .R file, which seems to be the only way that read_chunk works.
I think I'm at the point where I can import the source Rmd, but now I'm not sure how to call specific chunks from it in the final Rmd. Here's a reproducible example:
SourceCode.Rmd
---
title: "Source Code"
output:
pdf_document:
latex_engine: xelatex
---
```{r}
# Load libraries
library(knitr) # Create tables
library(kableExtra) # Table formatting
# Create a dataframe
df <- data.frame(x = 1:10,
y = 11:20,
z = 21:30)
```
Some explanatory text
```{r table1}
# Potentially big block of stuff I don't want to have to copy/paste
# But I want it in the final document
kable(df, booktabs=TRUE,
caption="Big long title for whatever") %>%
kable_styling(latex_options=c("striped","HOLD_position")) %>%
column_spec(1, width="5cm") %>%
column_spec(2, width="2cm") %>%
column_spec(3, width="3cm")
```
[Some other text, plus a bunch of other chunks I don't need for anyone to see in the clean version.]
```{r}
save(df, file="Source.Rdata")
```
FinalDoc.Rmd
---
title: "Final Doc"
output:
pdf_document:
latex_engine: xelatex
---
```{r setup, include=FALSE}
# Load libraries and data
library(knitr) # Create tables
library(kableExtra) # Table formatting
opts_chunk$set(echo = FALSE)
load("Source.Rdata")
```
As far as I can tell, this is likely the best way to load up SourceCode.Rmd (from the first linked source above):
```{r}
options(knitr.duplicate.label = 'allow')
source_rmd2 <- function(file, local = FALSE, ...){
options(knitr.duplicate.label = 'allow')
tempR <- tempfile(tmpdir = ".", fileext = ".R")
on.exit(unlink(tempR))
knitr::purl(file, output=tempR, quiet = TRUE)
envir <- globalenv()
source(tempR, local = envir, ...)
}
source_rmd2("SourceCode.Rmd")
```
At this point, I'm at a loss as to how to call the specific chunk table1 from SourceCode.Rmd. I've tried the following as per instructions here with no success:
```{r table1}
```
```{r}
<<table1>>
```
The first seems to do nothing, and the second throws an unexpected input in "<<" error.

I wrote a function source_rmd_chunks() that sources chunk(s) by label name. See gist.
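The gist itself is not reproduced here. As a rough illustration of the idea only, the sketch below pulls the code of one labelled chunk out of the lines of an Rmd file using base R. The helper extract_chunk is hypothetical, it is not the gist's code, and it assumes the label contains no regular expression metacharacters.

```r
# Hypothetical helper (not the gist's code): return the code lines of the
# chunk whose label is `label` from the lines of an Rmd file
extract_chunk <- function(lines, label) {
  fence <- strrep("`", 3)
  # Chunk headers look like <fence>{r label} or <fence>{r label, opt = value}
  header <- paste0("^", fence, "\\{r\\s+", label, "\\s*[,}]")
  start <- grep(header, lines)
  if (length(start) != 1L) stop("expected exactly one chunk named '", label, "'")
  # The first bare fence after the header closes the chunk
  closers <- grep(paste0("^", fence, "\\s*$"), lines)
  end <- closers[closers > start][1]
  if (end == start + 1L) return(character(0))
  lines[(start + 1L):(end - 1L)]
}

# Small in-memory document standing in for SourceCode.Rmd
fence <- strrep("`", 3)
rmd <- c(
  paste0(fence, "{r}"),
  "df <- data.frame(x = 1:3)",
  fence,
  "Some explanatory text",
  paste0(fence, "{r table1}"),
  "total <- sum(df$x)",
  fence
)

code <- extract_chunk(rmd, "table1")
print(code)
```

For a real file, the lines would come from readLines("SourceCode.Rmd"), and the returned code could be run with eval(parse(text = code)) inside a chunk of FinalDoc.Rmd.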

Display a data.frame with mathematical notation in table header R Markdown html output

Say I'd like to display a table of coefficients from several equations in an R Markdown file (html output).
I'd like the table to look somewhat like this:
But I can't for the life of me figure out how to tell R Markdown to parse the column names in the table.
The closest I've gotten is a hacky solution using cat to print custom table from my data.frame... not ideal. Is there a better way to do this?
Here's how I created the image above, saving my file as an .Rmd in RStudio.
---
title: "Math in R Markdown tables"
output:
html_notebook: default
html_document: default
---
My fancy table
```{r, echo=FALSE, include=TRUE, results="asis"}
# Make data.frame
mathy.df <- data.frame(site = c("A", "B"),
b0 = c(3, 4),
BA = c(1, 2))
# Do terrible things to print it properly
cat("Site|$\\beta_0$|$\\beta_A$")
cat("\n")
cat("----|---------|---------\n")
for (i in 1:nrow(mathy.df)){
cat(as.character(mathy.df[i,"site"]), "|",
mathy.df[i,"b0"], "|",
mathy.df[i,"BA"],
"\n", sep = "")
}
```

You can use kable() and its escape option to format math notation (see this answer to a related question). Then you assign your mathy headings as the column names, and there you go:
---
title: "Math in R Markdown tables"
output:
html_notebook: default
html_document: default
---
My fancy table
```{r, echo=FALSE, include=TRUE, results="asis"}
library(knitr)
mathy.df <- data.frame(site = c("A", "B"),
b0 = c(3, 4),
BA = c(1, 2))
colnames(mathy.df) <- c("Site", "$\\beta_0$", "$\\beta_A$")
kable(mathy.df, escape=FALSE)
```
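Two details of the renaming step are easy to check in plain R: the doubled backslash is only R string escaping, and the renamed columns are no longer syntactic names. A small sketch, using the same data frame as the answer:

```r
mathy.df <- data.frame(site = c("A", "B"),
                       b0 = c(3, 4),
                       BA = c(1, 2))
colnames(mathy.df) <- c("Site", "$\\beta_0$", "$\\beta_A$")

# writeLines() shows the characters that actually reach the markdown output:
# each header contains a single backslash
writeLines(colnames(mathy.df))

# After renaming, columns must be accessed with [[ ]] or backticks
b0 <- mathy.df[["$\\beta_0$"]]
print(b0)
```

If later code still needs the plain names, it may be simpler to rename a copy of the data frame just before calling kable().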

Naming iterative Rmarkdown documents

I'm trying to create multiple Rmarkdown documents (i.e., letters) that contain all the same text but are addressed to different people and have some unique text for each person. I've been taking a similar approach as the one laid out here:
http://rmarkdown.rstudio.com/articles_mail_merge.html
Basically, I have an R script that creates the Rmarkdown pdfs:
## Packages
library(knitr)
library(rmarkdown)
## Data
personalized_info <- read.csv(file = "meeting_times.csv")
## Loop
for (i in 1:nrow(personalized_info)){
rmarkdown::render(input = "mail_merge_handout.Rmd",
output_format = "pdf_document",
output_file = paste("handout_", i, ".pdf", sep=''),
output_dir = "handouts/")
}
and a .Rmd file to fill in the text below:
---
output: pdf_document
---
```{r echo=FALSE}
personalized_info <- read.csv("meeting_times.csv", stringsAsFactors = FALSE)
name <- personalized_info$name[i]
time <- personalized_info$meeting_time[i]
```
Dear `r name`,
Your meeting time is `r time`.
See you then!
When I run the above R script, I get a folder named "handouts," with files named "handout_1," "handout_2," etc. I would like the files to be named after the person in the dataset, and to do this I changed "i" to "name" under the loop heading of the code. This produces files named like "handout_Ezra Zanders," but the file name does not match the name of the person in the Rmarkdown pdf.
Anyone know of a solution for this in the loop part of the script, or another way of doing this?

You need to set the names inside your i loop. By the way, because you are using an external script to run your markdown, it is not necessary to read the csv file again in the markdown.
The R script that creates the Rmarkdown pdfs:
## Packages
library(knitr)
library(rmarkdown)
## Data
personalized_info <- read.csv(file = "meeting_times.csv")
## Loop
for (i in 1:nrow(personalized_info)) {
name <- personalized_info$name[i]
time <- personalized_info$meeting_time[i]
rmarkdown::render(input = "mail_merge_handout.Rmd",
output_format = "pdf_document",
output_file = paste("handout_", name, ".pdf", sep=''),
output_dir = "handouts/")
}
and a .Rmd file to fill in the text below:
---
output: pdf_document
---
Dear `r name`,
Your meeting time is `r time`.
See you then!
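Names such as "Ezra Zanders" contain spaces, which some tools handle poorly in file names. As an optional extra, the sketch below builds all output file names up front and replaces anything that is not a letter or digit with an underscore. The second name and both meeting times are made-up illustration data.

```r
# Illustrative data; only "Ezra Zanders" appears in the question
personalized_info <- data.frame(
  name = c("Ezra Zanders", "Ana O'Neil"),
  meeting_time = c("9:00", "10:30"),
  stringsAsFactors = FALSE
)

# Replace runs of characters other than letters and digits with "_"
safe_names <- gsub("[^A-Za-z0-9]+", "_", personalized_info$name)
output_files <- paste0("handout_", safe_names, ".pdf")
print(output_files)
```

Inside the loop, output_files[i] could then be passed as output_file in place of the paste() call.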

purl() within knit() duplicate label error

I am knitting a .Rmd file and want to have two outputs: the html and a purl'ed R script each time I run knit. This can be done with the following Rmd file:
---
title: "Purl MWE"
output: html_document
---
```{r}
## This chunk automatically generates a text .R version of this script when running within knitr.
input = knitr::current_input() # filename of input document
output = paste(tools::file_path_sans_ext(input), 'R', sep = '.')
knitr::purl(input,output,documentation=1,quiet=T)
```
```{r}
x=1
x
```
If you do not name the chunk, it works fine and you get html and .R output each time you run knit() (or click knit in RStudio).
However, if you name the chunk it fails. For example:
---
title: "Purl MWE"
output: html_document
---
```{r}
## This chunk automatically generates a text .R version of this script when running within knitr.
input = knitr::current_input() # filename of input document
output = paste(tools::file_path_sans_ext(input), 'R', sep = '.')
knitr::purl(input,output,documentation=1,quiet=T)
```
```{r test}
x=1
x
```
It fails with:
Quitting from lines 7-14 (Purl.Rmd)
Error in parse_block(g[-1], g[1], params.src) : duplicate label 'test'
Calls: <Anonymous> ... process_file -> split_file -> lapply -> FUN -> parse_block
Execution halted
If you comment out the purl() call, it will work with the named chunk. So there is something about how the purl() call is also naming chunks which causes knit() to think there are duplicate chunk names even when there are no duplicates.
Is there a way to include a purl() command inside a .Rmd file so both outputs (html and R) are produced? Or is there a better way to do this? My ultimate goal is to use the new rmarkdown::render_site() to build a website that updates the HTML and R output each time the site is compiled.

You can allow duplicate labels by including options(knitr.duplicate.label = 'allow') within the file as follows:
---
title: "Purl MWE"
output: html_document
---
```{r GlobalOptions}
options(knitr.duplicate.label = 'allow')
```
```{r}
## This chunk automatically generates a text .R version of this script when running within knitr.
input = knitr::current_input() # filename of input document
output = paste(tools::file_path_sans_ext(input), 'R', sep = '.')
knitr::purl(input,output,documentation=1,quiet=T)
```
```{r test}
x=1
x
```
This option isn't documented on the knitr website, but you can keep track of the latest changes directly on GitHub: https://github.com/yihui/knitr/blob/master/NEWS.md
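Because knitr.duplicate.label is an ordinary R option, it can be set and later restored like any other. The sketch below shows only that base R mechanism. Whether restoring the option right after the purl() call is safe while the document is still knitting has not been tested here and is an assumption.

```r
# Set the option and keep the previous value so it can be restored later
old <- options(knitr.duplicate.label = "allow")
during <- getOption("knitr.duplicate.label")

# ... the knitr::purl() call from the chunk above would go here ...

# Restore whatever was set before (the option is removed if it was unset)
options(old)

print(during)
```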

A related approach to @ruaridhw's solution would be to wrap the knitr::purl() call in callr::r(). See the function below, which saves the R chunks from a specified R Markdown file to a temporary .R file:
# RMD to local R temp file
# inspiration: https://gist.github.com/noamross/a549ee50e8a4fd68b8b1
rmd_chunks_to_r_temp <- function(file){
temp <- tempfile(fileext=".R")
# callr runs purl() in a separate R session, so this also works while knitting;
# otherwise purl() can trigger "duplicate chunk label" errors
callr::r(function(file, temp){
knitr::purl(file, output = temp)
},
args = list(file, temp))
}
This function also exists in funspotr:::rmd_chunks_to_r_temp() at brshallo/funspotr.

You can avoid this error with a bash chunk that calls purl in a separate R session. That way there's no need to allow duplicate labels.
An example use case is an Rmd file where the code is run (and not echo'd) throughout the report and then all the code chunks are shown with chunks names and code comments in an Appendix. If you don't require that additional functionality then you would only need up until the bash chunk.
The idea is that report_end signifies where to stop purl such that the appendix code isn't considered "report code". Then read_chunk reads the entire R file into one code chunk which can then be echo'd with syntax highlighting if required.
---
title: "Purl MWE"
output: html_document
---
These code chunks are used in the background of the report however
their source is not shown until the Appendix.
```{r test1, echo=FALSE}
x <- 1
x
```
```{r test2, echo=FALSE}
x <- x + 1
x
```
```{r test3, echo=FALSE}
x <- x + 1
x
```
# Appendix
```{r, eval=TRUE}
report_end <- "^# Appendix"
temp <- tempfile(fileext = ".R")
Sys.setenv(PURL_IN = shQuote("this_file.Rmd"), # eg. knitr::current_input()
PURL_OUT = shQuote(temp),
PURL_END = shQuote(report_end))
```
```{bash, include=FALSE}
Rscript -e "lines <- readLines($PURL_IN, warn = FALSE)" \
-e "knitr::purl(text = lines[1:grep($PURL_END, lines)], output = $PURL_OUT, documentation = 1L)"
```
```{r, include=FALSE}
knitr::read_chunk(temp, labels = "appendix")
unlink(temp)
```
```{r appendix, eval=FALSE, echo=TRUE}
```
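The truncation step inside the bash chunk, lines[1:grep($PURL_END, lines)], can be tried in isolation. The sketch below uses a small in-memory vector in place of the Rmd file. Taking the first match with [1] is an added safeguard, in case the marker appears more than once.

```r
report_end <- "^# Appendix"

# In-memory stand-in for readLines("this_file.Rmd")
lines <- c("---", "title: demo", "---",
           "Report text", "# Appendix", "Appendix text")

# Index of the first line matching the marker
end <- grep(report_end, lines)[1]

# Lines up to and including the marker, as passed to purl(text = ...)
report_lines <- lines[seq_len(end)]
print(report_lines)
```

If the marker is missing, end is NA and the subsetting fails, so a check with is.na(end) may be worth adding in real use.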
