Generate a customized Rmarkdown document - r

I have a shiny application in which a user can enter information through a form with various inputs like text, date etc. Once entered, this information is stored in a table in the backend. Now on the basis of this, the user wants to generate a document/pdf in a particular format for printing. Here is a demo of what the final filled up form might look like:
All the fields, for example, the billed to address, payment method, shipped to etc. are entered through the form and are stored in a table. How do I extract this information from a table and show it in this format? I am looking at parameterized Rmarkdown documents, but don't know how to get the format correct.

As for getting parameters into an rmarkdown file, you can use the yaml header. Since your data is most likely coming from a reactive environment, you have to save the object to the global environment using the <<- operator or assign(). These values can then be accessed from the Rmarkdown file with the yaml header. Say you have created an object called address:
---
output: pdf_document
address: NA
---
As for formatting the report, you will need to use LaTeX code, either inside the R Markdown document or in a .tex file that you specify in the YAML header:
---
output:
  pdf_document:
    includes:
      in_header: preamble.tex
address: NA
---
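Since the question mentions parameterized documents, an alternative to writing into the global environment is to declare the fields under `params:` in the YAML header and pass the values to `rmarkdown::render()`. The following is only a minimal sketch under assumptions: the template name `invoice.Rmd`, the helper names, and the column names `billed_to` and `payment_method` are all hypothetical.

```r
# Convert one row of the stored table into a named list suitable for `params`.
# The column names are hypothetical; they must match the names declared under
# `params:` in the template's YAML header.
row_to_params <- function(row) {
  stopifnot(is.data.frame(row), nrow(row) == 1)
  lapply(as.list(row), function(x) if (is.factor(x)) as.character(x) else x)
}

# Render the (hypothetical) template with those values. In a Shiny app this
# would typically be called from the content function of a downloadHandler.
render_invoice <- function(row, template = "invoice.Rmd",
                           out_file = tempfile(fileext = ".pdf")) {
  rmarkdown::render(
    input       = template,
    output_file = out_file,
    params      = row_to_params(row),
    envir       = new.env(parent = globalenv())
  )
}

# Example row, as it might come back from the backend table.
entry <- data.frame(billed_to = "12 Example Street", payment_method = "Card",
                    stringsAsFactors = FALSE)
str(row_to_params(entry))
```

Inside the template the values would then be available as `params$billed_to`, `params$payment_method`, and so on, and the LaTeX layout can refer to them through inline R expressions.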

Related

How can I generate .rmd files with param values set programmatically

I have been using this solution to generate r-markdown reports from an interactive (Shiny) app.
Rather than generate and download an HTML file, I'm looking to generate the corresponding .rmd file, since I want to be able to rerun the report with fixed settings at a later point in time.
To provide an example, using the template provided in the link, and for n=40, I want to generate a file containing the following code:
---
title: "Dynamic report"
output: html_document
params:
  n: 40
---
```{r}
# The `params` object is available in the document.
params$n
```
In other words, a file in which the placeholder 'NA' from the template is replaced by the actual used value of 40.
Short of manipulating the yaml in the file directly, is there a way to set new parameters in a rmd template and generate the resulting .rmd file?
An extremely simple version to help with your parameters Rmarkdown question. This requires Rstudio. Instead of regularly knitting your document, you knit with parameters...
Now we get a GUI interface where we can specify the value we want, in this case we made it "numeric"
---
title: My Document
output: html_document
params:
  input: numeric
---
`r params$input`
This renders the document and includes the interactively chosen parameter value in the HTML output. You can do much more with parameters, but this simple example should answer your question.
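If editing the YAML programmatically is acceptable after all, the header can be round-tripped through the yaml package rather than patched by hand. This is a minimal sketch, not an rmarkdown feature: the function name `write_rmd_with_params` is hypothetical, it assumes the file starts with a `---` ... `---` front-matter block, and re-serialising the header may change quoting or drop comments.

```r
# Write a copy of an Rmd template with the given params filled into its YAML
# header. Assumes the first line is `---` and the header is non-empty.
write_rmd_with_params <- function(template, output, new_params) {
  lines <- readLines(template)
  delim <- grep("^---\\s*$", lines)
  stopifnot(length(delim) >= 2, delim[1] == 1, delim[2] > 2)

  # Parse the existing header and overwrite/add the requested params.
  header_text <- paste(lines[seq(2, delim[2] - 1)], collapse = "\n")
  header <- yaml::yaml.load(header_text)
  header$params <- utils::modifyList(as.list(header$params), new_params)

  # Everything after the closing delimiter is copied unchanged.
  body <- lines[-seq_len(delim[2])]
  new_header <- sub("\n$", "", yaml::as.yaml(header))
  writeLines(c("---", new_header, "---", body), output)
  invisible(output)
}

# Hypothetical usage:
# write_rmd_with_params("report.Rmd", "report-n40.Rmd", list(n = 40))
```

The generated file can be rendered later with the stored values, independently of the Shiny session that produced it.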

How to find out within rmarkdown document to which format it is being rendered? [duplicate]

I would like to include specific content based on which format is being created. In this specific example, my tables look terrible in MS word output, but great in HTML. I would like to add in some test to leave out the table depending on the output.
Here's some pseudocode:
output.format <- opts_chunk$get("output")
if(output.format != "MS word"){
print(table1)
}
I'm sure this is not the correct way to use opts_chunk, but this is the limit of my understanding of how knitr works under the hood. What would be the correct way to test for this?
Short Answer
In most cases, opts_knit$get("rmarkdown.pandoc.to") delivers the required information.
Otherwise, query rmarkdown::all_output_formats(knitr::current_input()) and check if the return value contains word_document:
if ("word_document" %in% rmarkdown::all_output_formats(knitr::current_input())) {
  # Word output
}
Long answer
I assume that the source document is RMD because this is the usual/most common input format for knitting to different output formats such as MS Word, PDF and HTML.
In this case, knitr options cannot be used to determine the final output format because it doesn't matter from the perspective of knitr: For all output formats, knitr's job is to knit the input RMD file to a MD file. The conversion of the MD file to the output format specified in the YAML header is done in the next stage, by pandoc.
Therefore, we cannot use the package option knitr::opts_knit$get("out.format") to learn about the final output format but we need to parse the YAML header instead.
So far in theory. Reality is a little bit different. RStudio's "Knit PDF"/"Knit HTML" button calls rmarkdown::render which in turn calls knit. Before this happens, render sets a (undocumented?) package option rmarkdown.pandoc.to to the actual output format. The value will be html, latex or docx respectively, depending on the output format.
Therefore, if (and only if) RStudio's "Knit PDF"/"Knit HTML" button is used, knitr::opts_knit$get("rmarkdown.pandoc.to") can be used to determine the output format. This is also described in this answer and that blog post.
The problem remains unsolved for the case of calling knit directly because then rmarkdown.pandoc.to is not set. In this case we can exploit the (unexported) function parse_yaml_front_matter from the rmarkdown package to parse the YAML header.
[Update: As of rmarkdown 0.9.6, the function all_output_formats has been added (thanks to Bill Denney for pointing this out). It makes the custom function developed below obsolete – for production, use rmarkdown::all_output_formats! I leave the remainder of this answer as originally written for educational purposes.]
---
output: html_document
---
```{r}
knitr::opts_knit$get("out.format") # Not informative.
knitr::opts_knit$get("rmarkdown.pandoc.to") # Works only if knit() is called via render(), i.e. when using the button in RStudio.
rmarkdown:::parse_yaml_front_matter(
readLines(knitr::current_input())
)$output
```
The example above demonstrates the use(lessness) of opts_knit$get("rmarkdown.pandoc.to") (opts_knit$get("out.format")), while the line employing parse_yaml_front_matter returns the format specified in the "output" field of the YAML header.
The input of parse_yaml_front_matter is the source file as character vector, as returned by readLines. To determine the name of the file currently being knitted, current_input() as suggested in this answer is used.
Before parse_yaml_front_matter can be used in a simple if statement to implement behavior that is conditional on the output format, a small refinement is required: The statement shown above may return a list if there are additional YAML parameters for the output like in this example:
---
output:
  html_document:
    keep_md: yes
---
The following helper function should resolve this issue:
getOutputFormat <- function() {
  output <- rmarkdown:::parse_yaml_front_matter(
    readLines(knitr::current_input())
  )$output
  if (is.list(output)){
    return(names(output)[1])
  } else {
    return(output[1])
  }
}
It can be used in constructs such as
if(getOutputFormat() == 'html_document') {
  # do something
}
Note that getOutputFormat uses only the first output format specified, so with the following header only html_document is returned:
---
output:
  html_document: default
  pdf_document:
    keep_tex: yes
---
However, this is not very restrictive. When RStudio's "Knit HTML"/"Knit PDF" button is used (along with the dropdown menu next to it to select the output type), RStudio rearranges the YAML header such that the selected output format will be the first format in the list. Multiple output formats are (AFAIK) only relevant when using rmarkdown::render with output_format = "all". And: In both of these cases rmarkdown.pandoc.to can be used, which is easier anyways.
Since knitr 1.18, you can use the two functions
knitr::is_html_output()
and
knitr::is_latex_output()
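For the Word case from the question, the same kind of check can be written against the `rmarkdown.pandoc.to` option discussed above. A minimal sketch; the helper names `pandoc_target` and `is_docx_output` are hypothetical, and the result is `NA` when the option is unset (for example when `knit()` is called directly):

```r
# Pandoc target set by rmarkdown::render(), e.g. "html", "latex" or "docx".
pandoc_target <- function() {
  to <- knitr::opts_knit$get("rmarkdown.pandoc.to")
  if (is.null(to)) NA_character_ else to
}

# TRUE only while the document is being rendered to Word.
is_docx_output <- function() identical(pandoc_target(), "docx")

# In a chunk: skip the table for Word, print it otherwise.
# if (!is_docx_output()) print(table1)
```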
Just want to add a bit of clarification here. I often render the same R Markdown file (*.Rmd) into multiple formats (*.html, *.pdf, *.docx), so rather than wanting to know whether the format of interest is listed among those specified in the front matter YAML (i.e. "word_document" %in% rmarkdown::all_output_formats(knitr::current_input())), I want to know which format is currently being rendered. To do this you can either:
Get the first element of the formats listed in the front matter: rmarkdown::all_output_formats(knitr::current_input())[1]; or
Get the default output format name: rmarkdown::default_output_format(knitr::current_input())$name
For example...
---
title: "check format"
output:
  html_document: default
  pdf_document: default
  word_document: default
---
```{r}
rmarkdown::all_output_formats(knitr::current_input())[1]
```
```{r}
rmarkdown::default_output_format(knitr::current_input())$name
```
```{r}
fmt <- rmarkdown::default_output_format(knitr::current_input())$name
if (fmt == "pdf_document"){
  #...
}
if (fmt == "word_document"){
  #...
}
```
One additional point: the above answers don't work for an html_notebook, since code is being executed directly there and knitr::current_input() doesn't respond. If you know the document name you can call all_output_formats as above, specifying the name explicitly. I don't know if there's another way to do this.
This is what I use
library(stringr)
first_output_format <-
  names(rmarkdown::metadata[["output"]])[1]
if (!is.null(first_output_format)) {
  my_output <- str_split(first_output_format, "_")[[1]][1]
} else {
  my_output <- "unknown"
}

Multiple sets of Shared Options in R Markdown

Is it possible to have multiple sets of shared options for R Markdown?
This is my problem: I have a folder with a bunch of markdown files. The files can be divided into two groups:
html_document and
revealjs::revealjs_presentation.
I would like to factor out common YAML code from each of these groups. Now I know that I can create a _output.yaml file which would capture common YAML, but I essentially need to have two of these files, one for each of the output formats.
I saw the use of pandoc_args suggested here and I gave it a try as follows:
---
title: Document Type 1
output:
  html_document:
    pandoc_args: './common-html.yaml'
---
and
---
title: Document Type 2
output:
  revealjs::revealjs_presentation:
    pandoc_args: './common-reveal.yaml'
---
However using this setup the options from the included YAML files don't get processed.
Any other suggestions would be appreciated!
You can specify multiple output formats in the same _output.yaml file like this (just some example options):
html_document:
  self_contained: false
revealjs::revealjs_presentation:
  incremental: true
Then you have to render all output formats which cannot be done directly using the RStudio GUI. Instead you have to enter the following into the R console:
rmarkdown::render(input = "your.Rmd",
output_format = "all")
Ideally make sure that there's no output key in the YAML front matter of the .Rmd document itself. Otherwise the output options in _output.yaml file might get overridden. Unfortunately I couldn't find a comprehensive documentation of the exact behavior. Some of my observations so far:
Output options defined in the YAML front matter of the .Rmd document itself always override those specified in the shared options file _output.yaml.
Specifying an output format using the default option set (like pdf_document: default) in the YAML front matter of the .Rmd document itself completely overrides all options specified in _output.yaml. But if you don't explicitly specify the default options (like output: pdf_document; which is only possible for a single output format at once), the _output.yaml content is fully regarded.
If you have specified options for multiple output formats in _output.yaml, only the first one gets rendered when pressing the knit button in RStudio (even if you explicitly press knit to HTML/PDF/Word). You have to use rmarkdown::render(output_format = "all") to render the other formats, too.
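The console call above can be wrapped so that a whole folder is rendered in one go. A minimal sketch with hypothetical helper names; note that `output_format = "all"` renders each file to every format declared for it, so it does not by itself separate the two groups of documents from the question:

```r
# List the R Markdown sources in a folder (matches .Rmd and .rmd).
list_rmd <- function(dir = ".") {
  list.files(dir, pattern = "\\.[Rr]md$", full.names = TRUE)
}

# Render every file to all output formats declared for it.
render_folder <- function(dir = ".") {
  invisible(lapply(list_rmd(dir), rmarkdown::render, output_format = "all"))
}

# Hypothetical usage:
# render_folder("reports")
```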

Importing common YAML in rstudio/knitr document

I have a few Rmd documents that all have the same YAML frontmatter except for the title. How can I keep this frontmatter in one file and have it used for all the documents? It is getting rather large and I don't want to keep every file in step every time I tweak the frontmatter.
I want to still
use the Knit button/Ctrl+Shift+K shortcut in RStudio to do the compile
keep the whole setup portable: would like to avoid writing a custom output format or overriding rstudio.markdownToHTML (as this would require me to carry around a .Rprofile too)
Example
common.yaml:
author: me
date: "`r format (Sys.time(), format='%Y-%m-%d %H:%M:%S %z')`"
link-citations: true
reference-section-title: References
# many other options
an example document
----
title: On the Culinary Preferences of Anthropomorphic Cats
----
I do not like green eggs and ham. I do not like them, Sam I Am!
Desired output:
The compiled example document (ie either HTML or PDF), which has been compiled with the metadata in common.yaml injected in. The R code in the YAML (in this case, the date) would be compiled as a bonus, but it is not necessary (I only use it for the date which I don't really need).
Options/Solutions?
I haven't quite got any of these working yet.
With rmarkdown one can create a _output.yaml to put common YAML metadata, but this will put all of that metadata under output: in the YAML so is only good for options under html_document: and pdf_document:, and not for things like author, date, ...
write a knitr chunk to import the YAML, e.g.
----
title: On the Culinary Preferences of Anthropomorphic Cats
```{r echo=F, results='asis'}
cat(readLines('common.yaml'), sep='\n')
```
----
I do not like green eggs and ham. I do not like them, Sam I Am!
This works if I knit('input.Rmd') and then run pandoc on the output, but not if I use the Knit button in RStudio (which I assume calls render), because render parses the metadata before running knitr, and the metadata is malformed until knitr has been run.
Makefile: if I was clever enough I could write a Makefile or something to inject common.yaml into input.Rmd, then run rmarkdown::render(), and somehow hook it up to the Knit button of Rstudio, and perhaps somehow save this Rstudio configuration into the .Rproj file so that the whole thing is portable without me needing to edit .Rprofile too. But I'm not clever enough.
EDIT: I had a go at this last option and hooked up a Makefile to the Build command (Ctrl+Shift+B). However, this will build the same target every time I use it via Ctrl+Shift+B, and I want to build the target that corresponds with the Rmd file I currently have open in the editor [as for Ctrl+Shift+K].
Have found two options to do this portably (ie no .Rprofile customisation needed, minimal duplication of YAML frontmatter):
You can provide common yaml to pandoc on the command-line! d'oh!
You can set the knit: property of the metadata to your own function to have greater control over what happens when you Ctrl+Shift+K.
Option 1: common YAML to command line.
Put all the common YAML in its own file
common.yaml:
---
author: me
date: "`r format (Sys.time(), format='%Y-%m-%d %H:%M:%S %z')`"
link-citations: true
reference-section-title: References
---
Note it's complete, ie the --- are needed.
Then in the document you can specify the YAML as the last argument to pandoc, and it'll apply the YAML (see this github issue)
in example.rmd:
---
title: On the Culinary Preferences of Anthropomorphic Cats
output:
  html_document:
    pandoc_args: './common.yaml'
---
I do not like green eggs and ham. I do not like them, Sam I Am!
You could even put the html_document: stuff in an _output.yaml since rmarkdown will take that and place it under output: for all the documents in that folder. In this way there can be no duplication of YAML between all documents using this frontmatter.
Pros:
no duplication of YAML frontmatter.
very clean
Cons:
the common YAML is not passed through knit, so the date field above will not be parsed. You will get the literal string "r format(Sys.time(), format='%Y-%m-%d %H:%M:%S %z')" as your date.
from the same github issue:
Metadata definitions seen first are kept and left unchanged, even if conflicting data is parsed at a later point.
Perhaps this could be a problem at some point depending on your setup.
Option 2: override the knit command
This allows for much greater control, though is a bit more cumbersome/tricky.
This link and this one mention an undocumented feature in rmarkdown: the knit: part of the YAML will be executed when one clicks the "Knit" button of Rstudio.
In short:
define a function myknit(inputFile, encoding) that would read the YAML, put it in to the RMD and call render on the result. Saved in its own file myknit.r.
in the YAML of example.rmd, add
knit: (function (...) { source('myknit.r'); myknit(...) })
It seems to have to be on one line. The reason for source('myknit.r') instead of just putting the function definition in the YAML is portability: if I modify myknit.r, I don't have to modify every document's YAML. This way, the only common YAML that all documents must repeat in their frontmatter is the knit line; all other common YAML can stay in common.yaml.
Then Ctrl+Shift+K works as I would hope from within Rstudio.
Further notes:
myknit could just be a system call to make if I had a makefile setup.
the injected YAML will be passed through rmarkdown and hence knitted, since it is injected before the call to render.
Preview window: so long as myknit produces a (single) message Output created: path/to/file.html, then the file will be shown in the preview window.
I have found that there can be only one such message in the output [not multiple], or you get no preview window. So if you use render (which makes an "Output created: basename.extension") message and the final produced file is actually elsewhere, you will need to suppress this message via either render(..., quiet=T) or suppressMessages(render(...)) (the former suppresses knitr progress and pandoc output too), and create your own message with the correct path.
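The preview-window note can be sketched as follows: render quietly, move the result to its final location, then emit exactly one "Output created:" message with the real path. This is only an illustration of the note above; the function name `render_with_preview` and the `final_dir` argument are hypothetical.

```r
# Render quietly, copy the result to `final_dir`, and announce the final path
# with a single "Output created:" message so the preview window can pick it up.
render_with_preview <- function(input, final_dir, encoding = "UTF-8") {
  # quiet = TRUE suppresses render's own "Output created" message
  # (and the knitr progress / pandoc output as well).
  out <- rmarkdown::render(input, encoding = encoding, quiet = TRUE,
                           envir = new.env())
  final <- file.path(final_dir, basename(out))
  file.copy(out, final, overwrite = TRUE)
  message("Output created: ", final)
  invisible(final)
}
```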
Pros:
the YAML frontmatter is knitted
much more control than option 1 if you need to do custom pre- / post-processing.
Cons:
a bit more effort than option 1
the knit: line must be duplicated in each document (though by source('./myknit.r') at least the function definition may be stored in one central location)
Here is the setup for posterity. For portability, you only need to carry around myknit.r and common.yaml. No .Rprofile or project-specific config needed.
example.rmd:
---
title: On the Culinary Preferences of Anthropomorphic Cats
knit: (function (...) { source('myknit.r'); myknit(...) })
---
I do not like green eggs and ham. I do not like them, Sam I Am!
common.yaml [for example]:
author: me
date: "`r format (Sys.time(), format='%Y-%m-%d %H:%M:%S %z')`"
link-citations: true
reference-section-title: References
myknit.r:
myknit <- function (inputFile, encoding, yaml='common.yaml') {
    # read in the YAML + src file
    yaml <- readLines(yaml)
    rmd <- readLines(inputFile)
    # insert the YAML in after the first ---
    # I'm assuming all my RMDs have properly-formed YAML and that the first
    # occurrence of --- starts the YAML. You could do proper validation if you wanted.
    yamlHeader <- grep('^---$', rmd)[1]
    # put the yaml in
    rmd <- append(rmd, yaml, after=yamlHeader)
    # write out to a temp file
    ofile <- file.path(tempdir(), basename(inputFile))
    writeLines(rmd, ofile)
    # render with rmarkdown.
    message(ofile)
    ofile <- rmarkdown::render(ofile, encoding=encoding, envir=new.env())
    # copy back to the current directory.
    file.copy(ofile, file.path(dirname(inputFile), basename(ofile)), overwrite=T)
}
Pressing Ctrl+Shift+K/Knit from the editor of example.rmd will compile the result and show a preview. I know it is using common.yaml, because the result includes the date and author whereas example.rmd on its own does not have a date or author.
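The injection step of myknit can be isolated into a small function that is easy to check without running pandoc. A minimal sketch; the name `inject_yaml` is hypothetical, and dropping `---` lines from the shared file is an added assumption so that a common.yaml written with or without delimiters can be used:

```r
# Insert shared YAML lines right after the opening `---` of an Rmd document.
# Delimiter lines in the shared file are dropped first.
inject_yaml <- function(rmd, yaml) {
  yaml <- yaml[!grepl("^---\\s*$", yaml)]
  start <- grep("^---\\s*$", rmd)[1]
  stopifnot(!is.na(start))
  append(rmd, yaml, after = start)
}

doc    <- c("---", "title: Cats", "---", "Body")
shared <- c("---", "author: me", "---")
inject_yaml(doc, shared)
```

The shared lines end up before the document's own keys, inside the same front-matter block.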
The first solution proposed by @mathematical.coffee is a good approach, but the example given did not work for me (maybe because the syntax has changed). As noted there, this is possible by providing pandoc arguments in the YAML header. For example,
This is the content of a header.yaml file:
title: "Crime and Punishment"
author: "Fyodor Dostoevsky"
Add this to the beginning of the RMarkdown file:
---
output:
  html_document:
    pandoc_args: ["--metadata-file=header.yaml"]
---
See the pandoc manual for the --metadata-file argument.
