With one R markdown file, I would like to create different possible output pdf documents, where the output file name should be defined within the document. Is there any way to convince markdown to manipulate the output filename in such a way? Ideally I would like to pass the filename by an r chunk.
You can keep the simplicity of using the RStudio Knit button and reproducibility of a YAML header by using the undocumented knit hook to redefine what the button does (default function called is rmarkdown::render). The output_file parameter of the render function specifies the file name, so by setting it you override the standard behaviour of using the same prefix as the input filename.
For example, to always output a file called myfile.pdf:
knit: (function(inputFile, encoding) { rmarkdown::render(inputFile, encoding = encoding, output_file = file.path(dirname(inputFile), 'myfile.pdf')) })
The function can be an anonymous one-liner as well as imported from a package, as seen here with slidify.
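As a rough sketch of the non-anonymous variant, the hook can also be written as an ordinary named function. The names output_path_for and knit_to_fixed_name are made up for illustration and are not part of rmarkdown; only rmarkdown::render and its output_file argument come from the answer above.

```r
# Hypothetical helper: place the output next to the input file under a chosen name.
output_path_for <- function(inputFile, name) {
  file.path(dirname(inputFile), name)
}

# Hypothetical replacement for the default Knit action. It takes the same
# inputFile and encoding arguments as the anonymous function in the YAML header.
knit_to_fixed_name <- function(inputFile, encoding, name = "myfile.pdf") {
  rmarkdown::render(
    inputFile,
    encoding = encoding,
    output_file = output_path_for(inputFile, name)
  )
}

# The path logic can be checked without rendering anything:
output_path_for("reports/analysis.Rmd", "myfile.pdf")
```

If such a function lived in a package (say, a hypothetical mypkg), the header line would presumably shrink to knit: mypkg::knit_to_fixed_name.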
You can set your own YAML headers (I don't know if this is generally advised anyway), accessible under rmarkdown::metadata$newheader but they don't seem available from within this sort of function as far as I can see.
As for passing file name in from an R chunk... if you're referring to code chunks below the YAML header, from my experience I don't think that's possible(?). Headers can contain inline R commands (single backtick-enclosed, starting with r), but seemingly not for this hook function.
Related:
Rmarkdown GitHub repo issue — output format-specific output_file
Blog post I wrote following this question [invalid link, domain for sale as of 20210216] / corresponding GitHub wiki notes
This is pretty much what I do:
rmarkdown::render('my_markdown_report.Rmd',
                  output_file = paste('report.', Sys.Date(),
                                      '.pdf', sep=''))
I have three scripts: the first pulls the data and processes it, the second creates the charts and tables for the report, and the third creates the report from the markdown file. The code you see above is part of the third script.
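If the date-stamped name is needed in more than one place, the paste() call could be wrapped in a small helper. This is only a sketch; dated_report_name is a name invented here, and the "report.<date>.pdf" pattern simply mirrors the call above.

```r
# Hypothetical helper: build names such as "report.2022-08-08.pdf".
dated_report_name <- function(prefix = "report", date = Sys.Date(), ext = "pdf") {
  paste0(prefix, ".", format(date, "%Y-%m-%d"), ".", ext)
}

# With a fixed date, so the result is reproducible:
dated_report_name(date = as.Date("2022-08-08"))
```

The result would then be passed as output_file = dated_report_name() in the render call.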
I played around with the knit hook without fully understanding how it works and ended up with an ugly workaround. The code below seems to do the trick.
It would be nice if somebody could explain why it works and/or whether it can be written more cleanly.
For now I have lost the Shiny input screen, but I believe it can be added back later. The good thing is that the RStudio Knit button can still be used.
Please note that the subtitle and the file name are both This Works!, even with the space and the exclamation mark. The file is saved as This Works!.pdf.
The filename and subtitle are set by assigning the text to the object pSubTitle.
Note that the params are still in the YAML but do not produce a Shiny popup screen, as they are assigned in the knit hook.
---
params:
  sub_title:
    input: text
    label: Sub Title
    value: 'my_Sub_Title_and_File_Name'
title : "Parameterized_Title_and_output_file"
subtitle : "`r params$sub_title`"
output:
  pdf_document:
    keep_tex: false
knit: (
  function(inputFile, encoding) {
    pSubTitle <- 'This Works!'
    rmarkdown::render(
      input = inputFile,
      encoding = encoding,
      params = list(sub_title = pSubTitle),
      output_file = pSubTitle) })
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## R Markdown
This is an R Markdown document. ....
My current solution for this and similar questions is through 2 scripts:
Script1: "xxx.Rmd" file with a flexible YAML header, similar to Floris Padt's. This header lets you generate PDF files with a specified title, date, and other features by changing the params. However, it cannot set the PDF file name when you render it.
---
params:
  feature_input: "XXXA"
  date: "08/18/2022"
title: "Test For `r params$feature_input`"
author: "Your Name"
date: "`r params$date`"
output:
  pdf_document:
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## R Markdown and data process
```{r}
# Get the parameter from the YAML header
input_para <- params$feature_input
input_para
```
Script2: "yyy.R", which sets the params of the xxx.Rmd file and sets the PDF file names via output_file when rendering.
featureNames <- c("aaa", "bbb", "ccc")
setwd("path to xxx.Rmd")
for (currentFeature in featureNames) {
  rmarkdown::render("xxx.Rmd",
                    params = list(feature_input = currentFeature,
                                  date = Sys.Date()),
                    output_file = paste0("output/", currentFeature))
}
You can update featureNames in yyy.R and run that script to get the most flexible use of your xxx.Rmd file.
This solution allows you to:
update the YAML header parameters,
apply the updated YAML parameters in your .Rmd code chunks,
and save your PDFs with specific and flexible names.
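One way to check the file names before starting a long batch is to build the list of planned outputs first. The sketch below is an assumption-laden illustration: plan_outputs is a made-up helper, and it appends an explicit ".pdf" extension, which the loop above leaves to render().

```r
# Hypothetical helper: one row per feature with the output file it should produce.
plan_outputs <- function(features, out_dir = "output", ext = "pdf") {
  data.frame(
    feature = features,
    output_file = file.path(out_dir, paste0(features, ".", ext)),
    stringsAsFactors = FALSE
  )
}

plan <- plan_outputs(c("aaa", "bbb", "ccc"))
plan
```

Inside the loop, output_file = plan$output_file[i] could then replace the paste0() call.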
Related
How do I customize the filename of r quarto pdf document when I render? Before when I was using Rmarkdown I was using the following code in the YAML:
---
title: "Some title"
author: "First Last"
date: "`r format(Sys.Date(), '%d %B, %Y')`"
output: pdf_document
knit: (function(inputFile, encoding) { rmarkdown::render(inputFile, encoding = encoding, output_file = file.path(dirname(inputFile), paste0(Sys.Date(),"_Report","_FirstLast",".pdf"))) })
---
When I hit the "Knit" button the filename of the pdf document would be 2022-08-08_Report_FirstLast.pdf
Is there a way to do this with quarto pdf? I think the quarto_render function needs to be used but don't know how.
Use the output_file argument of the quarto_render function from the quarto R package.
library(quarto)
file_name <- paste0(Sys.Date(), "_Report", "_FirstLast", ".pdf")
quarto_render("your_qmd_file.qmd", output_file = file_name, output_format = "pdf")
Now, about the approach you are trying:
Firstly, there is no knit YAML key in Quarto, AFAIK.
Secondly, although R code can be used in code-chunk options when prefixed by !expr, it is not possible to use inline R code in the document YAML section at the time of writing this answer (03 Sep, 2022). See this discussion on GitHub.
There are some suggested workarounds for using R code in YAML in this discussion on GitHub, but using quarto_render to control the output filename seems the easiest option in your case.
Additionally, if your output file name is simple (that is, without any R code), you can use the output-file YAML option.
---
title: "Testing Output file name"
format:
  pdf:
    output-file: "output_file_name"
    output-ext: "pdf"
---
## Quarto
Quarto enables you to weave together content
and executable code into a finished document.
To learn more about Quarto
see <https://quarto.org>.
## Running Code
When you click the **Render** button
a document will be generated that
includes both content and the output of
embedded code.
This will create the output file named output_file_name.pdf in the directory where the source file is.
If you're comfortable using Quarto CLI in bash shell (which may be more convenient for automatic report generation), you could date-stamp your outputs like this.
now=`date +"%Y-%m-%d"`
quarto render my_doc.qmd --output "./out_$now.pdf" --to pdf
I would like to include specific content based on which format is being created. In this specific example, my tables look terrible in MS word output, but great in HTML. I would like to add in some test to leave out the table depending on the output.
Here's some pseudocode:
output.format <- opts_chunk$get("output")
if(output.format != "MS word"){
print(table1)
}
I'm sure this is not the correct way to use opts_chunk, but this is the limit of my understanding of how knitr works under the hood. What would be the correct way to test for this?
Short Answer
In most cases, opts_knit$get("rmarkdown.pandoc.to") delivers the required information.
Otherwise, query rmarkdown::all_output_formats(knitr::current_input()) and check if the return value contains word_document:
if ("word_document" %in% rmarkdown::all_output_formats(knitr::current_input())) {
# Word output
}
Long answer
I assume that the source document is RMD because this is the usual/most common input format for knitting to different output formats such as MS Word, PDF and HTML.
In this case, knitr options cannot be used to determine the final output format because it doesn't matter from the perspective of knitr: For all output formats, knitr's job is to knit the input RMD file to a MD file. The conversion of the MD file to the output format specified in the YAML header is done in the next stage, by pandoc.
Therefore, we cannot use the package option knitr::opts_knit$get("out.format") to learn about the final output format but we need to parse the YAML header instead.
So far in theory. Reality is a little bit different. RStudio's "Knit PDF"/"Knit HTML" button calls rmarkdown::render, which in turn calls knit. Before this happens, render sets an (undocumented?) package option rmarkdown.pandoc.to to the actual output format. The value will be html, latex or docx respectively, depending on the output format.
Therefore, if (and only if) RStudio's "Knit PDF"/"Knit HTML" button is used, knitr::opts_knit$get("rmarkdown.pandoc.to") can be used to determine the output format. This is also described in this answer and that blog post.
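A minimal sketch of how that option could be wrapped for the Word case from the question. The function name is_word_output is invented here; the only things taken from the answer are the option name rmarkdown.pandoc.to and the value docx. When knit() is called directly the option is unset (NULL), and the helper then simply returns FALSE.

```r
# Hypothetical helper. `to` defaults to the option that render() sets; it can
# also be passed explicitly, which makes the function easy to try out.
is_word_output <- function(to = knitr::opts_knit$get("rmarkdown.pandoc.to")) {
  identical(to, "docx")
}

is_word_output("docx")
is_word_output("html")
is_word_output(NULL)  # what knit() alone would give: option not set
```

Inside a chunk, if (!is_word_output()) print(table1) would then skip the table for Word output, assuming the document is rendered through rmarkdown::render.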
The problem remains unsolved for the case of calling knit directly because then rmarkdown.pandoc.to is not set. In this case we can exploit the (unexported) function parse_yaml_front_matter from the rmarkdown package to parse the YAML header.
[Update: As of rmarkdown 0.9.6, the function all_output_formats has been added (thanks to Bill Denney for pointing this out). It makes the custom function developed below obsolete – for production, use rmarkdown::all_output_formats! I leave the remainder of this answer as originally written for educational purposes.]
---
output: html_document
---
```{r}
knitr::opts_knit$get("out.format") # Not informative.
knitr::opts_knit$get("rmarkdown.pandoc.to") # Works only if knit() is called via render(), i.e. when using the button in RStudio.
rmarkdown:::parse_yaml_front_matter(
readLines(knitr::current_input())
)$output
```
The example above demonstrates the usefulness of opts_knit$get("rmarkdown.pandoc.to") and the uselessness of opts_knit$get("out.format"), while the line employing parse_yaml_front_matter returns the format specified in the "output" field of the YAML header.
The input of parse_yaml_front_matter is the source file as character vector, as returned by readLines. To determine the name of the file currently being knitted, current_input() as suggested in this answer is used.
Before parse_yaml_front_matter can be used in a simple if statement to implement behavior that is conditional on the output format, a small refinement is required: The statement shown above may return a list if there are additional YAML parameters for the output like in this example:
---
output:
  html_document:
    keep_md: yes
---
The following helper function should resolve this issue:
getOutputFormat <- function() {
  output <- rmarkdown:::parse_yaml_front_matter(
    readLines(knitr::current_input())
  )$output
  if (is.list(output)) {
    return(names(output)[1])
  } else {
    return(output[1])
  }
}
It can be used in constructs such as
if(getOutputFormat() == 'html_document') {
# do something
}
Note that getOutputFormat uses only the first output format specified, so with the following header only html_document is returned:
---
output:
  html_document: default
  pdf_document:
    keep_tex: yes
---
However, this is not very restrictive. When RStudio's "Knit HTML"/"Knit PDF" button is used (along with the dropdown menu next to it to select the output type), RStudio rearranges the YAML header such that the selected output format will be the first format in the list. Multiple output formats are (AFAIK) only relevant when using rmarkdown::render with output_format = "all". And: In both of these cases rmarkdown.pandoc.to can be used, which is easier anyways.
Since knitr 1.18, you can use the two functions
knitr::is_html_output()
and
knitr::is_latex_output()
Just want to add a bit of clarification here, since I often render the same Rmarkdown file (*.Rmd) into multiple formats (*.html, *.pdf, *.docx). Rather than wanting to know if the format of interest is listed amongst those specified in the front matter YAML (i.e. "word_document" %in% rmarkdown::all_output_formats(knitr::current_input())), I want to know which format is currently being rendered. To do this you can either:
Get the first element of the formats listed in the front matter: rmarkdown::all_output_formats(knitr::current_input())[1]; or
Get the default output format name: rmarkdown::default_output_format(knitr::current_input())$name
For example...
---
title: "check format"
output:
  html_document: default
  pdf_document: default
  word_document: default
---
```{r}
rmarkdown::all_output_formats(knitr::current_input())[1]
```
```{r}
rmarkdown::default_output_format(knitr::current_input())$name
```
```{r}
fmt <- rmarkdown::default_output_format(knitr::current_input())$name
if (fmt == "pdf_document"){
#...
}
if (fmt == "word_document"){
#...
}
```
One additional point: the above answers don't work for an html_notebook, since code is being executed directly there and knitr::current_input() doesn't respond. If you know the document name you can call all_output_formats as above, specifying the name explicitly. I don't know if there's another way to do this.
This is what I use
library(stringr)
first_output_format <-
  names(rmarkdown::metadata[["output"]])[1]
if (!is.null(first_output_format)) {
  my_output <- str_split(first_output_format, "_")[[1]][1]
} else {
  my_output <- "unknown"
}
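For what it's worth, the same prefix extraction can be sketched in base R without stringr. short_format is a made-up name, and the "unknown" fallback just mirrors the snippet above.

```r
# Hypothetical base-R variant: keep everything before the first underscore.
short_format <- function(format_name) {
  if (is.null(format_name) || length(format_name) == 0) {
    return("unknown")
  }
  sub("_.*$", "", format_name[1])
}

short_format("html_document")
short_format("word_document")
short_format(NULL)
```

In a document it would be called as short_format(names(rmarkdown::metadata[["output"]])).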
I am trying to modify the behavior of RStudio's knit button, by changing the directory to which it writes the output of knitting the Rmd file. I have started with this answer, but instead of having the filename given by a fixed string, I'd like to have the output filename based on the Rmd filename. However, the variable inputFile includes the full path to the Rmd file. Is there a way to get only the filename without the path?
The header I am working with that produces the full path+filename where I'd like just the filename (test2 is a directory that I created in the current working directory):
---
knit: (function(inputFile, encoding) {rmarkdown::render(inputFile,encoding=encoding, output_file=file.path(dirname(inputFile), "test2", paste0(substr(inputFile,1,nchar(inputFile)-4),".html"))) })
output: html_document
---
I'm still interested in a command that would directly give me the input filename, as specified in the question, but I found a workaround for the particular issue, using regex in the substr call, based on this:
knit: (function(inputFile, encoding) {rmarkdown::render(inputFile,encoding=encoding, output_file=file.path(dirname(inputFile), "test2", paste0(substr(inputFile,rev(gregexpr("/", inputFile)[[1]])[1]+1,nchar(inputFile)-4),".html"))) })
Nowadays, knitr comes with a function current_input() that gives you the name of the Rmd-file as a string. And tools::file_path_sans_ext() will remove the extension.
But to solve the exact problem of the OP, there are probably better options today, for example, knitr options, the package ezknitr, RStudio "Knit Directory" button, or here::here().
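Inside the knit hook itself, where inputFile is already available, base R's basename() together with tools::file_path_sans_ext() would give the bare file name. A small sketch, with output_in_subdir being a name made up for this example and test2 taken from the question:

```r
# Hypothetical helper: <dir of input>/<subdir>/<input name without extension>.<ext>
output_in_subdir <- function(inputFile, subdir = "test2", ext = "html") {
  stem <- tools::file_path_sans_ext(basename(inputFile))  # drop path, then extension
  file.path(dirname(inputFile), subdir, paste0(stem, ".", ext))
}

output_in_subdir("/home/user/project/report.Rmd")
```

In the YAML header this would replace the substr()/gregexpr() construction as the value passed to output_file.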
One of the features I like very much in Sweave is the option to have \SweaveInput{} of separate Sweave files to have a more "modular" report and just be able to comment out parts of the report that I do not want to be generated with a single #\SweaveInput{part_x} rather than having to comment in or out entire blocks of code.
Recently I decided to move to R Markdown for multiple reasons being mainly practicality, the option of interactive (Shiny) integration in the report and the fact that I do not really need the extensive formatting options of LaTeX.
I found that technically pandoc is able to combine multiple Rmd files into one html output by just concatenating them but it would be nice if this behaviour could be called from a "master" Rmd file.
Any answer would be greatly appreciated even if it is just "go back to Sweave, it is not possible in Markdown".
I am using R 3.1.1 for Windows and Linux as well as Rstudio 0.98.1056 and Rstudio server 0.98.983.
Use something like this in the main document:
```{r child="CapsuleRInit.Rmd"}
```
```{r child="CapsuleTitle.Rmd", eval=TRUE}
```
```{r child="CapsuleBaseline.Rmd", eval=TRUE}
```
Use eval=FALSE to skip one child.
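As an alternative sketch, the switches could be kept in one place, since the child option also accepts a character vector of file names. The variable names include and children are assumptions for illustration; the file names are the ones from the chunks above.

```r
# One switch per child document; set a value to FALSE to leave that part out.
include <- c(
  "CapsuleRInit.Rmd"    = TRUE,
  "CapsuleTitle.Rmd"    = TRUE,
  "CapsuleBaseline.Rmd" = FALSE
)

# Keep only the names whose switch is TRUE, in the order listed.
children <- names(include)[include]
children
```

A single chunk with the option child = children would then pull in only the selected files.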
For RStudio users: you can define a main document for LaTeX output, but this does not work for Rmd documents, so you always have to switch to the main document for processing. Please support my feature request to RStudio; I have already tried twice, but it seems to me that too few people use child docs for it to move higher in the priority list.
I don't quite understand some of the terms in the answer above, but the solution relates to defining a custom knit: hook in the YAML header. For multipartite documents this allows you to, for example:
Have a 'main' or 'root' Rmarkdown file with an output: markdown_document YAML header
render all child documents from Rmd ⇒ md ahead of calling render, or not if this is time-limiting
combine multiple files (with the child code chunk option) into one (e.g. for chapters in a report)
write output: html_document (or other format) YAML headers for this compilation output on the fly, prepending to the markdown effectively writing a fresh Rmarkdown file
...then render this Rmarkdown to get the output, deleting intermediate files in the process if desired
The code for all of the above (dumped here) is described here, a post I wrote after working out the usage of custom knit: YAML header hooks recently (here).
The custom knit: function (i.e. the replacement to rmarkdown::render) in the above example is:
(function(inputFile, encoding) {
  input.dir <- normalizePath(dirname(inputFile))
  rmarkdown::render(input = inputFile, encoding = encoding, quiet=TRUE,
                    output_file = paste0(input.dir,'/Workbook-tmp.Rmd'))
  sink("Workbook-compiled.Rmd")
  cat(readLines(headerConn <- file("Workbook-header.yaml")), sep="\n")
  close(headerConn)
  cat(readLines(rmdConn <- file("Workbook-tmp.Rmd")), sep="\n")
  close(rmdConn)
  sink()
  rmarkdown::render(input = paste0(input.dir,'/Workbook-compiled.Rmd'),
                    encoding = encoding, output_file = paste0(input.dir,'/../Workbook.html'))
  unlink(paste0(input.dir,'/Workbook-tmp.Rmd'))
})
...but all squeezed onto 1 line!
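The sink()/cat() step in the middle only prepends the header file to the intermediate document. A possibly simpler way to sketch that one step is with readLines() and writeLines(); concat_files is a made-up name, and the temp files below stand in for Workbook-header.yaml and Workbook-tmp.Rmd.

```r
# Hypothetical helper: write the lines of `header` followed by the lines of `body` to `out`.
concat_files <- function(header, body, out) {
  writeLines(c(readLines(header), readLines(body)), out)
  invisible(out)
}

# Demo with temporary files so that nothing in the working directory is touched.
header <- tempfile(fileext = ".yaml")
body   <- tempfile(fileext = ".Rmd")
out    <- tempfile(fileext = ".Rmd")
writeLines(c("---", "output: html_document", "---"), header)
writeLines(c("## Notes", "Some text."), body)

concat_files(header, body, out)
readLines(out)
```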
The rest of the 'master'/'root'/'control' file or whatever you want to call it takes care of writing the aforementioned YAML for the final HTML output that goes via an intermediate Rmarkdown file, and its second code chunk programmatically appends child documents through a call to list.files()
```{r include=FALSE}
header.input.file <- "Workbook-header.yaml"
header.input.dir <- normalizePath(dirname(header.input.file))
fileConn <- file(header.input.file)
# One vector element per line of the YAML header; nesting is set by the leading spaces
writeLines(c(
  "---",
  paste0('title: "', rmarkdown::metadata$title,'"'),
  "output:",
  "  html_document:",
  "    toc: true",
  "    toc_depth: 3 # defaults to 3 anyway, but just for ease of modification",
  "    number_sections: TRUE",
  paste0("    css: ",paste0(header.input.dir,'/../resources/workbook_style.css')),
  '    pandoc_args: ["--number-offset=1", "--atx-headers"]',
  "---"),
  fileConn, sep="\n")
close(fileConn)
```
```{r child = list.files(pattern = 'Notes-.*?\\.md')}
# Use file names with strict numeric ordering, e.g. Notes-001-Feb1.md
```
The directory structure would contain a top-level folder with
A final output Workbook.html
A resources subfolder containing workbook_style.css
A documents subfolder containing said main file "Workbook.Rmd" alongside files named as "Notes-001.Rmd", "Notes-002.Rmd" etc. (so that globbing with list.files(pattern = "Notes-.*?\\.Rmd") finds them and makes them children in the correct order when rendering the main Workbook.Rmd file)
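To illustrate why the zero-padded numbering matters, here is a small sketch of the selection and ordering step on a plain character vector. notes_children is a hypothetical helper, and the radix sort method is chosen only so the result does not depend on the locale.

```r
# Hypothetical helper: keep the Notes-*.Rmd files and order them by name.
notes_children <- function(files, pattern = "^Notes-.*\\.Rmd$") {
  sort(grep(pattern, files, value = TRUE), method = "radix")
}

files <- c("Notes-010.Rmd", "Workbook.Rmd", "Notes-002.Rmd", "Notes-001.Rmd")
notes_children(files)

# Without zero padding the order is by character rather than by number,
# so "Notes-10.Rmd" sorts before "Notes-2.Rmd":
sort(c("Notes-10.Rmd", "Notes-2.Rmd"), method = "radix")
```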
To get proper numbering of files, each constituent "Notes-XXX.Rmd" file should contain the following style YAML header:
---
title: "March: Doing x, y, and z"
knit: (function(inputFile, encoding) { input.dir <- normalizePath(dirname(inputFile)); rmarkdown::render(input = inputFile, encoding = encoding, quiet=TRUE)})
output:
  md_document:
    variant: markdown_github
    pandoc_args: "--atx-headers"
---
```{r echo=FALSE, results='asis', comment=''}
cat("##", rmarkdown::metadata$title)
```
The code chunk at the top of the Rmarkdown document enters the YAML title as a second-level header when evaluated. results='asis' tells knitr to emit the text as-is rather than as printed R output such as
[1] "A text string"
You would knit each of these before knitting the main file - it's easier to remove the requirement to render all child documents and just append their pre-produced markdown output.
I've described all of this at the links above, but I thought it'd be bad manners not to leave the actual code with my answer.
I don't know how effective that RStudio feature request website may be... Personally I've not found it hard to look into the source code for these functions, which thankfully are open source, and if there really is something absent rather than undocumented an inner-workings-informed feature request is likely far more actionable by one of their software devs.
I'm not familiar with Sweave; was the above what you were aiming at? If I understand correctly, you just want to control the inclusion of documents in a modular fashion. The child = list.files() statement could take care of that: if not through file globbing, you can simply list the files as child = c("file1.md","file2.md")... and edit that statement to change the children. You can also control TRUE/FALSE switches with YAML, whereby the presence of a custom header would set some children to be included, for example
potentially.absent.variable: TRUE
...above the document with a silent include=FALSE hiding the machinations of the first chunk:
```{r include=FALSE}
# isTRUE() also maps NA (from a value that is not logical) to FALSE
isTRUE(as.logical(rmarkdown::metadata$potentially.absent.variable))
# ==> FALSE if potentially.absent.variable is absent
# ==> FALSE if anything other than TRUE
# ==> TRUE if TRUE
checkFor <- function(var) {
  # var is the header name as a string
  isTRUE(as.logical(rmarkdown::metadata[[var]]))
}
```
```{r child = "Optional_file.md", eval = checkFor("potentially.absent.variable")}
```