Inserting code chunks from one Rmarkdown document into another - r

I have been running some small R tutorials / workshops for which I keep my 'challenge scripts' in Rmarkdown documents. These contain free text and R-code blocks. Some of the code blocks are prefilled (eg, to set up datasets for later use), whereas some are there for the attendees to fill-in code during the workshop.
For each challenge script, I have a solution script. The latter contains all the free text of the former, but any challenge-blocks have been filled in (there's an example of a solutions workbook here).
I don't really want to keep two closely related copies of the same file (the challenge and the solutions workbook). So I was wondering if there's an easy way to construct my challenge scripts from my solutions scripts (or the solutions script from a challenge-script and an R-script containing just the solution blocks).
For example, is there an easy way to replace all named code-blocks in one Rmarkdown file with the correspondingly-named code block from another rmarkdown file?
That is, if I have
challenge.Rmd
HEADER
INTRODUCTION
Today we're going to learn about sampling pseudo-random numbers in R
```{r challenge_1}
# Challenge 1: Make a histogram of 100 randomly-sampled
# normally-distributed values
```
BLAH BLAH
END_OF_FILE
solutions.Rmd
HEADER
```{r challenge_1}
# Challenge 1: Make a histogram of 100 randomly-sampled
# normally-distributed values
hist(rnorm(100))
```
END_OF_FILE
How do I replace challenge_1 from challenge.Rmd with challenge_1 from solutions.Rmd?
All the best

This is one approach:
challenge.Rmd
---
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
show_solution <- FALSE
```
```{r child="solution.Rmd", eval=show_solution}
```
Today we're going to learn about sampling pseudo-random numbers in R
```{r challenge_1}
# Challenge 1: Make a histogram of 100 randomly-sampled
# normally-distributed values
```
```{r challenge_1_s, eval=show_solution, echo=show_solution}
```
```{r challenge_2}
# Challenge 2: Make a histogram of 100 randomly-sampled
# uniform-distributed values
```
```{r challenge_2_s, eval=show_solution, echo=show_solution}
```
solution.Rmd
```{r challenge_1_s, eval=FALSE, echo=FALSE}
# Challenge 1: Make a histogram of 100 randomly-sampled
# normally-distributed values
hist(rnorm(100))
```
```{r challenge_2_s, eval=FALSE, echo=FALSE}
# Challenge 2: Make a histogram of 100 randomly-sampled
# uniform-distributed values
hist(runif(100))
```
With the show_solution flag you can include or exclude the solutions from your R Markdown output. Participants cannot compile the document with show_solution = TRUE unless they have solution.Rmd. With show_solution = FALSE there is no problem and it compiles nicely.
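For the literal question (swapping a named chunk in one file for the same-named chunk from another), a rough base-R sketch might look like the following. The helper names chunk_index and replace_chunks are hypothetical, and the sketch assumes well-formed files whose chunk headers have the form {r label} or {r label, options}. Unlabelled chunks are left alone.

```r
# Hypothetical helpers, not part of knitr or rmarkdown.
fence <- strrep("`", 3)  # the three-backtick chunk delimiter

# Locate labelled R chunks: returns label, first line and last line of each.
chunk_index <- function(lines) {
  header <- paste0("^", fence, "\\{r\\s+([A-Za-z0-9_.-]+)\\s*[,}].*$")
  starts <- grep(header, lines)
  closers <- grep(paste0("^", fence, "\\s*$"), lines)
  # Assumes every chunk is closed by a bare fence line
  ends <- vapply(starts, function(s) min(closers[closers > s]), numeric(1))
  data.frame(label = sub(header, "\\1", lines[starts]),
             start = starts, end = ends, stringsAsFactors = FALSE)
}

# Replace each labelled chunk in target_lines with the same-labelled
# chunk from donor_lines, when the donor has one.
replace_chunks <- function(target_lines, donor_lines) {
  donor <- chunk_index(donor_lines)
  tgt <- chunk_index(target_lines)
  # Work bottom-up so earlier line numbers stay valid after each splice
  for (i in rev(seq_len(nrow(tgt)))) {
    j <- match(tgt$label[i], donor$label)
    if (is.na(j)) next
    before <- target_lines[seq_len(tgt$start[i] - 1)]
    after <- target_lines[seq_len(length(target_lines) - tgt$end[i]) + tgt$end[i]]
    target_lines <- c(before, donor_lines[donor$start[j]:donor$end[j]], after)
  }
  target_lines
}

challenge <- c("INTRODUCTION",
               paste0(fence, "{r challenge_1}"),
               "# Challenge 1: Make a histogram",
               fence,
               "BLAH BLAH")
solutions <- c(paste0(fence, "{r challenge_1}"),
               "# Challenge 1: Make a histogram",
               "hist(rnorm(100))",
               fence)

merged <- replace_chunks(challenge, solutions)
writeLines(merged)
```

With real files you would presumably pass readLines("challenge.Rmd") and readLines("solutions.Rmd") and write the result back out with writeLines().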

Related

Rmarkdown backticks inside inline code / inconsistent behavior with usual code chunks

This works in a usual code chunk in R markdown:
m1_aov <- anova(m1)
m1_aov$`Sum Sq`[2] %>% round(3)
Unfortunately, using the latter in inline code breaks the knitr parser:
`r m1_aov$`Sum Sq`[2] %>% round(3)`
Indeed, it also breaks Stackoverflow.
I looked at this related question but could not infer a working solution to my problem. Any hint?
Expanding the comment with a working example:
---
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## R Markdown
```{r}
a <- tibble::tibble(`a column` = 1:10) # using tibble to get a column name with a white space
m <- mean(a$`a column`)
```
Mean is `r m`
To me this looks like a neat trick: it avoids putting unnecessarily long code inside the text and sidesteps the problem you are facing, at the (small) cost of creating new objects.
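If you would rather keep the expression inline, one way around the backtick clash (my suggestion, not something from the answer above) is to index the column with [[ ]] and a quoted name, which needs no backticks. The sketch below checks in plain R that both forms give the same value. The model fitted to mtcars is only a stand-in for m1.

```r
# Stand-in model (an assumption; the original m1 is not shown in the question)
m1 <- lm(mpg ~ wt, data = mtcars)
m1_aov <- anova(m1)

# Backtick form: fine in a chunk, awkward inside inline code
with_backticks <- m1_aov$`Sum Sq`[2]

# Bracket form: same column, no backticks needed
with_brackets <- m1_aov[["Sum Sq"]][2]

identical(with_backticks, with_brackets)
round(with_brackets, 3)
```

The inline expression could then be written with round(m1_aov[["Sum Sq"]][2], 3) between the inline-code delimiters.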
How to set global variables in R Markdown?

I have a markdown file with 4 chunks of code, but all chunks depend on the same initial database. When I run each chunk separately no problem happens, but when I knit, the problem begins: the chunks don't identify the variables and dataset already in the environment.
I solve this problem by loading all the datasets again, but it's not efficient.
How do I turn all the variables in the first chunk into global variables, i.e., make them available to all chunks?
```{r, echo=FALSE, message=FALSE, results='hide', warning=FALSE}
library(dplyr)
library(zoo)
library(tidyr)
library(ggplot2)
variable1
variable2
```
```{r, echo=FALSE, message=FALSE, results='hide', warning=FALSE}
variable1 %>%
ggplot(aes(x = date, y = whatever)) +
geom_line()
```
For example, I have variables in the first chunk that will be plotted in another chunk. But these variables for some reason are not available.
Another problem: I have to load the packages in each chunk.
I appreciate it if someone can help!
You generally just use an initial block commonly labeled setup. And it does not even have to be a 'global' (in the sense of <<- assignment) variable: subsequent chunks are aware of earlier chunks.
Code
---
title: demo
---
```{r setup, echo=FALSE}
suppressMessages(library(zoo))
startDate <- as.Date("2020-01-01")
```
Some text.
```{r code}
data <- zoo(1:3, order.by=startDate + 0:2)
print(data)
```
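To convince yourself that later chunks see objects from earlier ones, here is a small sketch that knits a two-chunk document from a character vector into an environment of its own. It assumes knitr is installed. The chunk labels and the variables x and y are made up for the illustration.

```r
library(knitr)

fence <- strrep("`", 3)  # the three-backtick chunk delimiter

# A tiny two-chunk document: the second chunk uses x from the first
doc <- c(paste0(fence, "{r setup}"),
         "x <- 21",
         fence,
         "Some text.",
         paste0(fence, "{r later}"),
         "y <- x * 2",
         fence)

# Knit into a dedicated environment so the result can be inspected afterwards
env <- new.env()
out <- knit(text = doc, envir = env, quiet = TRUE)

get("y", envir = env)
```

Both chunks are evaluated in the same environment, so no <<- assignment is needed.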

R Markdown: Importing R script objects

I have R code that generates several plots and tables from my data. I want to write a report in R Markdown in which I include just the plots and the tables without rewriting the R code. One way is to use the 'read_chunk' function, but it does not serve my purpose. The following is a simple example of what I need.
Suppose I have the following R script 'Example.r'
x <- 1:4
y <- sin(x)
####----Table
table <- cbind(x,y)
####----Plot
plot_1 <- plot(x,y)
Now in my R Markdown file I want the following:
```{r echo=FALSE, cache=FALSE}
knitr::read_chunk('Example.r')
```
The following table shows the results:
```{r Table, echo=FALSE}
```
One can depict the result in a plot as well:
```{r Plot, echo=FALSE}
```
In the above example, I will not be able to insert either the table or the plot, since both need the inputs 'x' and 'y' to be defined before the table and plot commands are run. Is there a way to implement this without hardcoding 'x' and 'y' twice?
Instead of sourcing an R script, here is my workflow which might be useful to you:
(sorry I feel it should be a comment but it gets a bit lengthy)
1. Keep a draft.rmd and a report.rmd side by side. The draft.rmd is your workplace for exploratory data analysis, and the report.rmd is your polished report.
2. Gather the results (such as data.frames and ggplot objects) you want to put in the report in a list, and save the list as a result_181023.rda file in a folder like data/.
3. Load the saved result_181023.rda file in the report.rmd, draw your figures, print your tables, and polish your report the way you like.
An example:
```{r data, echo=FALSE, cache=FALSE}
# a list named result.list
# With a table result.list$df
# and a ggplot object: result.list$gg1
load("data/result_181023.rda")
```
The following table shows the results:
```{r Table, echo=FALSE}
knitr::kable(result.list$df)
```
One can depict the result in a plot as well:
```{r Plot, echo=FALSE}
result.list$gg1
```
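The draft.rmd side of this workflow is not shown above. A minimal sketch of it, assuming only base R (a plain data frame stands in for the real results, and the file goes to a temporary directory rather than data/), could be:

```r
# In draft.rmd: collect what the report needs in one list
x <- 1:4
result.list <- list(df = data.frame(x = x, y = sin(x)))

# Save the list; tempdir() is used here only so the sketch runs anywhere
path <- file.path(tempdir(), "result_181023.rda")
save(result.list, file = path)

# In report.rmd: load() restores the object under its original name.
# A separate environment is used here just to show what comes back.
restored <- new.env()
load(path, envir = restored)
restored$result.list$df
```

Because load() brings the list back under the name it was saved with, the report chunks can refer to result.list$df directly.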

Rmd/Bookdown sharing sections/etc between Rmd documents

You have 2 documents: 1) Commercial and 2) Technical
The two should never diverge in the following sense:
Anything in the Commercial document must, verbatim, be in the technical document.
Is anything like the following possible in R by using Rmd/Bookdown?
In the spirit of "writing the code you wish you had":
Commercial.Rmd
```{r child = 'chapter_01.Rmd'}
```
chapter_01.Rmd:
## Section A
This is the sales by quarter:
```{r sales_by_quarter, results='asis'}
```
Technical.Rmd
```{r child = 'chapter_01_technical.Rmd'}
```
chapter_01_technical.Rmd:
</ Some magic to pull in 'Section A' from 'chapter_01.Rmd' \>
### Technical subsection
We define sales according to .... and handle missing data ...
```{r sales_missing_data_summary}
```
I am aware of the following:
Including a document in another
```{r child = 'chapter1.Rmd'}
```
Including R code
```{r shared_code}
```
Where the code is defined in shared.R:
#shared.R
#This code snippet is included in multiple places
## @knitr shared_code
a <- 10
b <- 0.05

Creating summaries at the top of a knitr report that use variables that are defined later

Is there a standard way to include the computed values from variables early on in the written knitr report before those values are computed in the code itself? The purpose is to create an executive summary at the top of the report.
For example, something like this, where variable1 and variable2 are not defined until later:
---
title: "Untitled"
output: html_document
---
# Summary
The values from the analysis are `r variable1` and `r variable2`
## Section 1
In this section we compute some values. We find that the value of variable 1 is `r variable1`
```{r first code block}
variable1 <- cars[4, 2]
```
## Section 2
In this section we compute some more values. In this section we compute some values. We find that the value of variable 2 is `r variable2`
```{r second code block}
variable2 <- cars[5, 2]
```
A simple solution is to simply knit() the document twice from a fresh Rgui session.
The first time through, the inline R code will trigger some complaints about variables that can't be found, but the chunks will be evaluated, and the variables they return will be left in the global workspace. The second time through, the inline R code will find those variables and substitute in their values without complaint:
library(knitr)
knit("eg.Rmd")
knit2html("eg.Rmd")
## RStudio users will need to explicitly set knit's environment, like so:
# knit("eg.Rmd", envir=.GlobalEnv)
# knit2html("eg.Rmd", envir=.GlobalEnv)
Note 1: In an earlier version of this answer, I had suggested doing knit(purl("eg.Rmd")); knit2html("eg.Rmd"). This had the (minor) advantage of not running the inline R code the first time through, but has the (potentially major) disadvantage of missing out on knitr caching capabilities.
Note 2 (for RStudio users): RStudio necessitates an explicit envir=.GlobalEnv because, as documented here, it by default runs knit() in a separate process and environment. Its default behavior aims to avoid touching anything in the global environment, which means that the first run won't leave the needed variables lying around anywhere that the second run can find them.
Here is another approach, which uses brew + knit. The idea is to let knitr make a first pass on the document, and then run the result through brew. You can automate this workflow by introducing the brew step as a document hook that is run after knitr is done with its magic. Note that you will have to use brew markup <%= variable %> to print values in place.
---
title: "Untitled"
output: html_document
---
# Summary
The values from the analysis are <%= variable1 %> and
<%= variable2 %>
## Section 1
In this section we compute some values. We find that the value of variable 1
is <%= variable1 %>
```{r first code block}
variable1 = cars[6, 2]
```
## Section 2
In this section we compute some more values. In this section we compute
some values. We find that the value of variable 2 is <%= variable2 %>
```{r second code block}
variable2 = cars[5, 2]
```
```{r cache = F}
require(knitr)
knit_hooks$set(document = function(x){
x1 = paste(x, collapse = '\n')
paste(capture.output(brew::brew(text = x1)), collapse = '\n')
})
```
This has become pretty easy using the ref.label chunk option. See below:
---
title: Report
output: html_document
---
```{r}
library(pixiedust)
options(pixiedust_print_method = "html")
```
### Executive Summary
```{r exec-summary, echo = FALSE, ref.label = c("model", "table")}
```
Now I can make reference to `fit` here, even though it isn't yet defined in the script. For example, I can get the slope for the `qsec` variable by calling `round(coef(fit)[2], 2)`, which yields 0.93.
Next, I want to show the full table of results. This is stored in the `fittab` object created in the `"table"` chunk.
```{r, echo = FALSE}
fittab
```
### Results
Then I need a chunk named `"model"` in which I define a model of some kind.
```{r model}
fit <- lm(mpg ~ qsec + wt, data = mtcars)
```
And lastly, I create the `"table"` chunk to generate `fittab`.
```{r table}
fittab <-
dust(fit) %>%
medley_model() %>%
medley_bw() %>%
sprinkle(pad = 4,
bg_pattern_by = "rows")
```
I work in knitr, and the following two-pass system works for me. I have two (invisible) code chunks, one at the top and one at the bottom. The bottom chunk saves to a file (statedata.R) the values of any variables that the text needs before they are actually computed. The top chunk sets those variables to something that stands out if they haven't been defined yet, and then (if the file exists) grabs the actual values from it.
The script needs to be knit twice, as values will be available only after one pass through. Note that the second chunk erases the saved state file at the end of the second pass, so that any later changes to the script that affect the saved variables will have to be computed anew (so that we don't accidentally report old values from an earlier run of the script).
---
title: "Untitled"
output: html_document
---
```{r, echo=FALSE, results='hide'}
# grab saved computed values from earlier passes
if (!exists("variable1")) {
variable1 <- "UNDEFINED"
variable2 <- "UNDEFINED"
if (file.exists("statedata.R")) {
source("statedata.R")
}
}
```
# Summary
The values from the analysis are `r variable1` and `r variable2`
## Section 1
In this section we compute some values. We find that the value of variable 1 is `r variable1`
```{r first code block}
variable1 <- cars[4, 2]
```
## Section 2
In this section we compute some more values. In this section we compute some values. We find that the value of variable 2 is `r variable2`
```{r second code block}
variable2 <- cars[5, 2]
```
```{r save variables for summary,echo=FALSE,results='hide'}
if (!file.exists("statedata.R")) {
dump(c("variable1","variable2"), file="statedata.R")
} else {
file.remove("statedata.R")
}
```
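The dump()/source() round trip that this approach relies on can be tried outside knitr. The sketch below is only an illustration: it writes to a temporary file instead of statedata.R, and uses explicit environments (pass1 and pass2, names of my own) to stand in for the two knitting passes.

```r
state_file <- file.path(tempdir(), "statedata.R")

# First pass: the values get computed, then written out as R code
pass1 <- new.env()
pass1$variable1 <- cars[4, 2]
pass1$variable2 <- cars[5, 2]
dump(c("variable1", "variable2"), file = state_file, envir = pass1)

# Second pass: a fresh environment starts with placeholders ...
pass2 <- new.env()
pass2$variable1 <- "UNDEFINED"
pass2$variable2 <- "UNDEFINED"

# ... which are overwritten if the state file exists
if (file.exists(state_file)) {
  source(state_file, local = pass2)
}

c(pass2$variable1, pass2$variable2)

# Clean up, as the bottom chunk does at the end of the second pass
file.remove(state_file)
```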
LaTeX macros can solve this problem. See this answer to my related question.
\newcommand\body{
\section{Analysis}
<<>>=
x <- 2
@
Some text here
} % Finishes body
\section*{Executive Summary}
<<>>=
x
@
\body

Resources