I am using rmarkdown to do 5 main steps: reading, cleaning, transformations, analysis and viz.
My plan is to divide the code for each step into one or two markdown files (.Rmd). Each following .Rmd will be called as an external file into the next .Rmd (i.e I want to nest my .Rmd of each step into the next one).
I am nesting the .Rmd files by replacing:
knitr::opts_chunk$set(echo = TRUE)
with:
knitr::knit("**ABSOLUTE PATH TO PREVIOUS .Rmd FILE**")
This worked, until my 3rd nesting, when I get an error at the knitr::knit line:
Error in file(file, "rt" ) Cannot open the connection calls: <Anonymous> … withVisible ->eval ->eval -> read.csv ->read.table -> file
NOTE: every time I have referenced anything this has been done with absolute paths, so this shouldn't be the issue.
Can anyone point me to the correct way of nesting .Rmd into eachother?
Finally, if this workflow seems savage, I would welcome any other suggestions on architecture! My reasons for the nesting (as opposed to putting everything in a big R Notebook are) are:
I want to work in .Rmd 'pages' to have separate tabs with less code I need to scroll through (and potentially mess up by accident).
I can pass data from one step to the other without losing time to write it somewhere on my disk.
(Basically I want to be able to evaluate and use the results from the previous .Rmd)
I can make a clean html or pdf page for every step of the process.
To combine multiple child Rmd files, I use a general Rmd one that call all childs. In that way, you can also decide not to show some of them with eval=FALSE directly in this main file.
More info on https://yihui.name/knitr/demo/child/
The main file would look like any Rmd:
---
title: Modelling distribution of species in the Bay of Biscay
author: "StatnMap - Sébastien Rochette"
date: '`r format(Sys.time(), "%d %B, %Y")`'
output:
html_document: default
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
<!-- Content -->
```{r reading, child='Reading.Rmd', eval=TRUE}
```
```{r cleaning, child='Cleaning.Rmd', eval=TRUE}
```
```{r analysis, child='Analysis.Rmd', eval=TRUE}
```
Related
Using R exams, I am developing a pdf exam with several questions (hence several Rmd files) but the questions are connected and would use a dataset created in the first question file. Questions would not be amenable to a cloze format.
Is there a way to write the exercises so that the second exercise can access the data generated by the first exercise ?
The easiest solution is to use a shared environment across the different exercises, in the simplest case the .GlobalEnv. Then you can simply do
exams2pdf(c("ex1.Rmd", "ex2.Rmd"), envir = .GlobalEnv)
and then both exercises will create their variables in the global environment and can re-use existing variables from there. Instead of the .GlobalEnv you can also create myenv <- new.env() and use envir = myenv.
For Rnw (as opposed to Rmd) exercises, it is not necessary to set this option because Sweave() Rnw exercises are always processed in the current environment anyway.
Note that these approaches only work for those exams2xyz() interfaces, where the n-th random draw from each exercise can be assured to end up together in the n-the exam. This is the case for PDF output but not for many of the learning management system outputs (Moodle, Canvas, etc.). See: Sharing a random CSV data set across exercises with exams2moodle()
Is it an option to save the data you need to disk in one Rmd file
```{r, echo=FALSE}
saveRDS(df, "my_stored_data.rds")
```
and then load it in the other one
```{r, echo=FALSE}
readRDS(df, "my_stored_data.rds")
```
Another option could be to knit the Rmd files from an R script and then knit them from this R script. If you do that, the Rmd files use the environment of the R script (!) instead of creating their own. Hence you can use the same objects (and therefore of course let one Rmd script store the data, while the other uses it as input.
In this thread: Create sections through a loop with knitr
there is a post from me about doing this. It's basically this:
The first Rmd file:
---
title: "Script 1"
output: html_document
---
```{r setup, include=FALSE}
a_data_frame_created_in_script_1 <- mtcars
```
saved as rmd_test.Rmd
The second one:
---
title: "Script 1"
output: html_document
---
```{r setup}
a_data_frame_created_in_script_1
```
saved as rmd_test_2.Rmd.
And then you have an R-script that does this:
rmarkdown::render("rmd_test.Rmd", output_file = "rmd_test.html")
rmarkdown::render("rmd_test_2.Rmd", output_file = "rmd_test_2.html")
I have created a report PerfReport.Rmd that compares one person's performance to the performance of a large group. I have hundreds of individuals that should receive a personalized version of this report, comparing their own performance to the large group.
Obstacles I see are:
1. I need each filename to include the person's name.
2. Each file has calculations that are specific to the person.
Here's an example .Rmd
---
title: "Course Success"
output:
html_document: flexdashboard::flex_dashboard
pdf_document: default
---
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
library(dplyr)
library(plotly)
```
```{r comp, echo=FALSE, message=FALSE,warning=FALSE}
df<-data.frame(Person=c('a','a','a','a','a','b','b','b','b','b','c','c','c','c','c','d','d','d','d','d'),
Success=c(1,0,1,1,0,1,0,0,0,0,1,1,1,1,1,0,1,0,0,1),
Valid=c(1,1,1,1,1,1,0,1,1,1,1,1,1,0,1,1,1,1,1,1))
testperson<-'b'
comparison<-df%>%
transmute(Your_Rate=sum(Success[Person==testperson])/sum(Valid[Person==testperson]),
Baseline=sum(Success[Person!=testperson])/sum(Valid[Person!=testperson]))%>%
distinct(Your_Rate,.keep_all = TRUE)
plot_ly(comparison,y=~Your_Rate, type='bar')%>%
add_trace(y=~Baseline)
```
As I have it structured, a variable in the .Rmd defines the person that the calculations are being made for, and the only way I'm taking care of filenames is by manually saving a file with the person's name before knitting.
My guess at how to accomplish this is to:
Produce .Rmd files with the names of each person appended,(i.e PerfReport_a.Rmd, PerfReport_b.Rmd) and change the testperson variable in each file.
Run a script that knits each of these .Rmd files into a set of html files.
If I'm right about the general steps here, I have step 2 covered with a .R file that knits every .Rmd file within a directory.
files<-list.files("E:/Dashboards/",pattern = "[.]Rmd$")
files2<-as.data.frame(files)
files3<-files2%>%
mutate(filenow=paste0("E:/Dashboards/",files))
files4<-files3$filenow
for (f in files4) rmarkdown::render(f)
Any help with a means of generating the .Rmd files with the adjusted values for testperson would be greatly appreciated. If there is simply a better means of getting from a single, master .Rmd to the hundreds of personalized .html dashboards I'm looking to produce, I would love to learn that method, too!
Thank you in advance.
You can use yaml parameter params to get this right. You will need
one template (template.Rmd)
script to run the template (script.R)
Here is a small trivial example:
template.Rmd
---
title: no title
author: Roman
date: "`r Sys.Date()`"
params:
testperson: NA
---
```{r}
print(params$testperson)
```
```{r}
sessionInfo()
```
script.R
library(rmarkdown)
persons <- c("person1", "person2", "person3")
for (person in persons) {
rmarkdown::render(input = "template.Rmd",
output_file = sprintf("%s_report.html", person),
params = list(testperson = person)
)
}
This should populate your working folder with personX_report.html files (see screenshot below).
I want to use knitr to format an R markdown file, lets call it Main.rmd. Some code in Main.rmd relies on helper functions in a second file, lets call it Functions.rmd.
When I first run Functions.rmd and then Main.rmd, the code in Main.rmd runs fine. When I first run Functions.rmd and then try to knit Main.rmd, I receive an evaluation:
Error "Object 'myfunction' not found
How can I fix this without combining Main.rmd and Functions.rmd into a single document, which I would like to avoid doing?
Edit: I've added a toy example below. There are very useful suggestions so far for how to call the functions in Functions.rmd from Main.rmd, but they all require converting Functions.rmd to a .R file. However, for my current purpose, it is important that Functions.rmd can also be read as a standalone markdown document.
First, Main.rmd:
---
title: "Main_test"
author: "Matt Nolan"
date: "25/06/2018"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Background.
This is the main body of text and code used to display results of analyses, some of which are created by calling functions in Functions.Rmd.
```{r cars}
myexamplefunction(1,2)
```
And, here is Functions.rmd:
---
title: "Functions_test"
author: "Matt Nolan"
date: "25/06/2018"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Background
This is a document containing functions used in the document "Main_test".
Because it contains functions and formatted text to explain the functions for an interested reader, it should be usable as a standalone markdown document.
For example, this is a function that adds two numbers.
```{r cars}
myexamplefunction <- function(a, b) {a + b}
```
30Jun2018 Update: R Markdown does not support combining Rmd files
Matt's 25Jun2018 update clarifies the question, asking how to embed one Rmd document in another Rmd document. Per the R Markdown website, R Markdown requires a single Rmd file. It does not currently support the embedding of one Rmd file within another Rmd document.
That said, with the bookdown package, you could structure the Rmd files as chapters in a book, where each Rmd file is a chapter in the book. For details, see Bookdown: Authoring Books with R Markdown 1.4 - Two Rendering Approaches, the Getting Started page, and the Bookdown Demo github repository for an example book built in bookdown.
25Jun2018 Update: Printing the code in an Appendix
Per the comments from the OP, the reason for including the functions in an Rmd file instead of an R file was to obtain a formatted printout of the code in an Appendix. This is possible with the technique I originally posted plus a few changes.
Use named chunks to put the code in an appendix, and use the arguments echo=TRUE and eval=FALSE to avoid executing it multiple times.
Execute code from the Appendix in main flow of the document by way of the ref.label= argument, and keep the code from printing in the main document with the echo=FALSE argument.
In addition to using the source() function, one must print each function in another chunk in the appendix in order to obtain formatted print of each function.
An updated version of my example Rmd file is listed below.
---
title: "TestIncludedFiles"
author: "Len Greski"
date: "June 24, 2018"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Background
A question was posted on [Stackoverflow](https://stackoverflow.com/questions/51013924/calling-functions-in-a-second-file-when-compiling-rmd-files-with-knitr) about how to include functions from one Rmd file while knitting another.
If the second file contains R functions to be accessed in the second Rmd file, they're best included as R files rather than Rmd. In this example we'll include three files of functions from the Johns Hopkins University *R Programming* course: `pollutantmean()`, `corr()`, and `complete()`. We'll execute them in a subsequent code block.
After an update to the original post where the original poster noted that he included the functions in an Rmd file in order to provide a formatted printout of the code in the report as an appendix, I've modified this example to account for this additional requirement.
```{r ref.label="sourceCode",echo=FALSE}
# execute sourceCode chunk from appendix
```
## Executing the sourced files
Now that the required R functions have been sourced, we'll execute them.
```{r runCode, echo=TRUE}
pollutantmean("specdata","nitrate",70:72)
complete("specdata",1:10)
corr("specdata",threshold=500)
```
# Appendix
```{r sourceCode,echo=FALSE,eval=FALSE}
# use source() function to source the functions we want to execute
source("./rprogramming/oneLine_pollutantmean.r")
source("./rprogramming/oneLine_complete.r")
source("./rprogramming/oneLine_corr.r")
```
The following is an inventory of the functions used in this Rmd file.
```{r }
pollutantmean
complete
corr
```
...and the output for the Appendix section of the document (with redactions to avoid publishing answers to a class programming assignment).
Original Answer
If the second Rmd file only contains functions, you're better off saving them as an R file and using source() to include them in Main.Rmd. For example:
date: "June 24, 2018"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Background
A question was posted on [Stackoverflow](https://stackoverflow.com/questions/51013924/calling-functions-in-a-second-file-when-compiling-rmd-files-with-knitr) about how to include functions from one Rmd file while knitting another.
If the second file contains R functions to be accessed in the second Rmd file, they're best included as R files rather than Rmd. In this example we'll include three files of functions from the Johns Hopkins University *R Programming* course: `pollutantmean()`, `corr()`, and `complete()`. We'll execute them in a subsequent code block.
```{r sourceCode,echo=TRUE}
# use source() function to source the functions we want to execute
source("./rprogramming/pollutantmean.r")
source("./rprogramming/complete.r")
source("./rprogramming/corr.r")
```
## Executing the sourced files
Now that the required R functions have been sourced, we'll execute them.
```{r runCode, echo=TRUE}
pollutantmean("specdata","nitrate",70:72)
complete("specdata",1:10)
corr("specdata",threshold=500)
```
...produces the following output:
DISCLOSURE: This answer includes techniques that I previously posted as a blog article in 2016, ToothGrowth Assignment: Accessing R Code from an Appendix in Knitr.
I have been reading about R Markdown (here, here, and here) and using it to create solid reports. I would like to try to use what little code I am running to do some ad hoc analyses and turn them into more scalable data reports.
My question is rather broad: Is there a proper way to organize your code around an R Markdown project? Say, have one script that generates all of the data structures?
For example: Let's say that I have the cars data set and I have brought in commercial data on the manufacturer. What if I wanted to attach the manufacturer to the current cars data set, and then produce a separate summary table for each company using a manipulated data set cars.by.name as well as plot a certain sample using cars.import?
EDIT: Right now I have two files open. One is an R Script file that has all of the data manipulation: subsetting and re-categorizing values. And the other is the R Markdown file where I am building out text to accompany the various tables and plots of interest. When I call an object from the R Script file--like:
```{r}
table(cars.by.name$make)
```
I get an error saying Error in summary(cars.by.name$make) : object 'cars.by.name' not found
EDIT 2: I found this older thread to be helpful. Link
---
title: "Untitled"
author: "Jeb"
date: "August 4, 2015"
output: html_document
---
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.
When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
```{r}
table(cars.by.name$make)
```
```{r}
summary(cars)
summary(cars.by.name)
```
```{r}
table(cars.by.name)
```
You can also embed plots, for example:
```{r, echo=FALSE}
plot(cars)
plot(cars.import)
```
Note that the `echo = FALSE` parameter was added to the code chunk to prevent printing of the R code that generated the plot.
There is a solution for this sort of problem, explained here.
Basically, if you have an .R file containing your code, there is no need to repeat the code in the .Rmd file, but you can include the code from .R file. For this to work, the chunks of code should be named in the .R file, and then can be included by name in the .Rmd file.
test.R:
## ---- chunk-1 ----
table(cars.by.name$make)
test.Rmd
Just once on top of the .Rmd file:
```{r echo=FALSE, cache= F}
knitr::read_chunk('test.R')
```
For every chunk you're including (replace chunk-1 with the label of that specific chunk in your .R file):
```{r chunk-1}
```
Note that it should be left empty (as is) and in run-time your code from .R will be brought over here and run.
Often times, I have many reports that need to run the same code with slightly different parameters. Calling all my "stats" functions separately, generating the results and then just referencing is what I typically do. The way to do this is as follows:
---
title: "Untitled"
author: "Author"
date: "August 4, 2015"
output: html_document
---
```{r, echo=FALSE, message=FALSE}
directoryPath <- "rawPath" ##Something like /Users/userid/RDataFile
fullPath <- file.path(directoryPath,"myROutputFile.RData")
load(fullPath)
```
Some Text, headers whatever
```{r}
summary(myStructure$value1) #Where myStructure was saved to the .RData file
```
You can save an RData file by using the save.image() command.
Hope that helps!
One of the features I like very much in Sweave is the option to have \SweaveInput{} of separate Sweave files to have a more "modular" report and just be able to comment out parts of the report that I do not want to be generated with a single #\SweaveInput{part_x} rather than having to comment in or out entire blocks of code.
Recently I decided to move to R Markdown for multiple reasons being mainly practicality, the option of interactive (Shiny) integration in the report and the fact that I do not really need the extensive formatting options of LaTeX.
I found that technically pandoc is able to combine multiple Rmd files into one html output by just concatenating them but it would be nice if this behaviour could be called from a "master" Rmd file.
Any answer would be greatly appreciated even if it is just "go back to Sweave, it is not possible in Markdown".
I am using R 3.1.1 for Windows and Linux as well as Rstudio 0.98.1056 and Rstudio server 0.98.983.
Use something like this in the main document:
```{r child="CapsuleRInit.Rmd"}
```
```{r child="CapsuleTitle.Rmd", eval=TRUE}
```
```{r child="CapsuleBaseline.Rmd", eval=TRUE}
```
Use eval=FALSE to skip one child.
For RStudio users: you can define a main document for latex output, but this does not work for RMD documents, so you always have to switch to the main document for processing. Please support my feature request to RStudio; I tried already twice, but is seems to me that too few people use child docs to put it higher in the priority list.
I don't quite understand some of the terms in the answer above, but the solution relates to defining a custom knit: hook in the YAML header. For multipartite documents this allows you to, for example:
Have a 'main' or 'root' Rmarkdown file with an output: markdown_document YAML header
render all child documents from Rmd ⇒ md ahead of calling render, or not if this is time-limiting
combine multiple files (with the child code chunk option) into one (e.g. for chapters in a report)
write output: html_document (or other format) YAML headers for this compilation output on the fly, prepending to the markdown effectively writing a fresh Rmarkdown file
...then render this Rmarkdown to get the output, deleting intermediate files in the process if desired
The code for all of the above (dumped here) is described here, a post I wrote after working out the usage of custom knit: YAML header hooks recently (here).
The custom knit: function (i.e. the replacement to rmarkdown::render) in the above example is:
(function(inputFile, encoding) {
input.dir <- normalizePath(dirname(inputFile))
rmarkdown::render(input = inputFile, encoding = encoding, quiet=TRUE,
output_file = paste0(input.dir,'/Workbook-tmp.Rmd'))
sink("Workbook-compiled.Rmd")
cat(readLines(headerConn <- file("Workbook-header.yaml")), sep="\n")
close(headerConn)
cat(readLines(rmdConn <- file("Workbook-tmp.Rmd")), sep="\n")
close(rmdConn)
sink()
rmarkdown::render(input = paste0(input.dir,'/Workbook-compiled.Rmd'),
encoding = encoding, output_file = paste0(input.dir,'/../Workbook.html'))
unlink(paste0(input.dir,'/Workbook-tmp.Rmd'))
})
...but all squeezed onto 1 line!
The rest of the 'master'/'root'/'control' file or whatever you want to call it takes care of writing the aforementioned YAML for the final HTML output that goes via an intermediate Rmarkdown file, and its second code chunk programmatically appends child documents through a call to list.files()
```{r include=FALSE}
header.input.file <- "Workbook-header.yaml"
header.input.dir <- normalizePath(dirname(header.input.file))
fileConn <- file(header.input.file)
writeLines(c(
"---",
paste0('title: "', rmarkdown::metadata$title,'"'),
"output:",
" html_document:",
" toc: true",
" toc_depth: 3 # defaults to 3 anyway, but just for ease of modification",
" number_sections: TRUE",
paste0(" css: ",paste0(header.input.dir,'/../resources/workbook_style.css')),
' pandoc_args: ["--number-offset=1", "--atx-headers"]',
"---", sep="\n"),
fileConn)
close(fileConn)
```
```{r child = list.files(pattern = 'Notes-.*?\\.md')}
# Use file names with strict numeric ordering, e.g. Notes-001-Feb1.md
```
The directory structure would contain a top-level folder with
A final output Workbook.html
A resources subfolder containing workbook_style.css
A documents subfolder containing said main file "Workbook.Rmd" alongside files named as "Notes-001.Rmd", "Notes-002.Rmd" etc. (to ensure a fileglobbing on list.files(pattern = "Notes-.*?\\.Rmd) finds and thus makes them children in the correct order when rendering the main Workbook.Rmd file)
To get proper numbering of files, each constituent "Notes-XXX.Rmd" file should contain the following style YAML header:
---
title: "March: Doing x, y, and z"
knit: (function(inputFile, encoding) { input.dir <- normalizePath(dirname(inputFile)); rmarkdown::render(input = inputFile, encoding = encoding, quiet=TRUE)})
output:
md_document:
variant: markdown_github
pandoc_args: "--atx-headers"
---
```{r echo=FALSE, results='asis', comment=''}
cat("##", rmarkdown::metadata$title)
```
The code chunk at the top of the Rmarkdown document enters the YAML title as a second-level header when evaluated. results='asis' indicates to return plain text-string rather than
[1] "A text string"
You would knit each of these before knitting the main file - it's easier to remove the requirement to render all child documents and just append their pre-produced markdown output.
I've described all of this at the links above, but I thought it'd be bad manners not to leave the actual code with my answer.
I don't know how effective that RStudio feature request website may be... Personally I've not found it hard to look into the source code for these functions, which thankfully are open source, and if there really is something absent rather than undocumented an inner-workings-informed feature request is likely far more actionable by one of their software devs.
I'm not familiar with Sweave, was the above was what you were aiming at? If I understand correctly you just want to control the inclusion of documents in a modular fashion. The child = list.files() statement could take care of that: if not through file globbing you can straight-up list files as child = c("file1.md","file2md")... and switch that statement to change the children. You can also control TRUE/FALSE switches with YAML, whereby the presence of a custom header would set some children to be included for example
potentially.absent.variable: TRUE
...above the document with a silent include=FALSE hiding the machinations of the first chunk:
```{r include=FALSE}
!all(!as.logical(rmarkdown::metadata$potentially.absent.variable)
# ==> FALSE if potentially.absent.variable is absent
# ==> FALSE if anything other than TRUE
# ==> TRUE if TRUE
checkFor <- function(var) {
return !all(!as.logical(rmarkdown::metadata[[var]])
}
```
```{r child = "Optional_file.md", eval = checkFor(potentially.absent.variable)}
```