RMarkdown render to Notebook with child chunks included - r

I'm after a way to render an Rmd document (that contains references to various "child" files) to a self-contained R Notebook without these dependencies.
At the moment, the .Rmd code chunks are located throughout a number of .R, .py and .sql files and are referenced in the report using
```{r extraction, include=FALSE, cache=FALSE}
knitr::read_chunk("myscript.R")
```
followed by
```{r chunk_from_myscript}
```
as documented here.
I've done this to avoid code duplication and to allow for running the source files separately however these code chunks are only executable in the report via a call to knit or render (when read_chunk is run and the code chunk is available).
Is there a way to spin-off an Rmd (prior to knitting) with
just these chunks populated?
This function
rmarkdown::render("report.Rmd", clean = FALSE)
almost gets there as it leaves the markdown files behind whilst removing extraction and populating chunk_from_myscript however as these files are straight markdown, the chunks are no longer executable and the chunk options are missing. It obviously also doesn't include chunks where eval=TRUE, echo=FALSE which would be needed to run the resulting notebook.
I've also looked at knitr::spin however this would mean disseminating the contents of the report to every source file and isn't terribly ideal.
Reprex
report.Rmd
---
title: 'Report'
---
```{r read_chunks, include=FALSE, cache=FALSE}
knitr::read_chunk("myscript.R")
```
Some documentation
```{r chunk_from_myscript}
```
Some more documentation
```{r chunk_two_from_myscript, eval=TRUE, echo=FALSE}
```
myscript.R
#' # MyScript
#'
#' This is a valid R source file which is formatted
#' using the `knitr::spin` style comments and code
#' chunks.
#' The file's code can be used in large .Rmd reports by
#' extracting the various chunks using `knitr::read_chunk` or
#' it can be spun into its own small commented .Rmd report
#' using `knitr::spin`
# ---- chunk_from_myscript
sessionInfo()
#' This is the second chunk
# ---- chunk_two_from_myscript
1 + 1
Desired Output
notebook.Rmd
---
title: 'Report'
---
Some documentation
```{r chunk_from_myscript}
sessionInfo()
```
Some more documentation
```{r chunk_two_from_myscript, eval=TRUE, echo=FALSE}
1 + 1
```

Working through your reprex I now better understand the issue you are trying to solve. You can knit into an output.Rmd to merge your report and scripts into a single markdown file.
Instead of using knitr::read_chunk, I've read in with knitr::spin to cat the asis output into another .Rmd file. Also note the params$final flag to allow rendering the final document when set as TRUE or allowing the knit to an intermediate .Rmd as FALSE by default.
report.Rmd
---
title: "Report"
params:
final: false
---
```{r load_chunk, include=FALSE}
chunk <- knitr::spin(text = readLines("myscript.R"), report = FALSE, knit = params$final)
```
Some documentation
```{r print_chunk, results='asis', echo=FALSE}
cat(chunk, sep = "\n")
```
to produce the intermediate file:
rmarkdown::render("report.Rmd", "output.Rmd")
output.Rmd
---
title: "Report"
---
Some documentation
```{r chunk_from_myscript, echo=TRUE}
sessionInfo()
```
With the secondary output.Rmd, you could continue with my original response below to render to html_notebook so that the document may be shared without needing to regenerate but still containing the source R markdown file.
To render the final document from report.Rmd you can use:
rmarkdown::render("report.Rmd", params = list(final = TRUE))
Original response
You need to include additional arguments to your render statement.
rmarkdown::render(
input = "output.Rmd",
output_format = "html_notebook",
output_file = "output.nb.html"
)
When you open the .nb.html file in RStudio the embedded .Rmd will be viewable in the editing pane.

Since neither knitr::knit nor rmarkdown::render seem suited to rendering to R markdown, I've managed to somewhat work around this by dynamically inserting the chunk text into each empty chunk and writing that to a new file:
library(magrittr)
library(stringr)
# Find the line numbers of every empty code chunk
get_empty_chunk_line_nums <- function(file_text){
# Create an Nx2 matrix where the rows correspond
# to code chunks and the columns are start/end line nums
mat <- file_text %>%
grep(pattern = "^```") %>%
matrix(ncol = 2, byrow = TRUE)
# Return the chunk line numbers where the end line number
# immediately follows the starting line (ie. chunk is empty)
empty_chunks <- mat[,1] + 1 == mat[,2]
mat[empty_chunks, 1]
}
# Substitute each empty code chunk with the code from `read_chunk`
replace_chunk_code <- function(this_chunk_num) {
this_chunk <- file_text[this_chunk_num]
# Extract the chunk alias
chunk_name <- stringr::str_match(this_chunk, "^```\\{\\w+ (\\w+)")[2]
# Replace the closing "```" with "<chunk code>\n```"
chunk_code <- paste0(knitr:::knit_code$get(chunk_name), collapse = "\n")
file_text[this_chunk_num + 1] %<>% {paste(chunk_code, ., sep = "\n")}
file_text
}
render_to_rmd <- function(input_file, output_file, source_files) {
lapply(source_files, knitr::read_chunk)
file_text <- readLines(input_file)
empty_chunks <- get_empty_chunk_line_nums(file_text)
for (chunk_num in empty_chunks){
file_text <- replace_chunk_code(file_text, chunk_num)
}
writeLines(file_text, output_file)
}
source_files <- c("myscript.R")
render_to_rmd("report.Rmd", "output.Rmd", source_files)
This has the added benefits of preserving chunk options and working
with Python and SQL chunks too since there is no requirement to evaluate
any chunks in this step.

Related

Split Rmarkdown (.Rmd) file into mutiple files (.Rmd)

I made a big Rmarkdown (.Rmd) file that contains only chunks of R codes. I'm wondering if there's a way to automatically generate a (.Rmd) file for every chunk or if I have to manually create a (.Rmd) file for every chunk I have (over 200 chunks).
Let's say I have this (.Rmd) file :
---
title: "Untitled"
output: html_document
date: '2022-08-13'
---
R Markdown
``{r cars}
summary(cars)
``
Including Plots
``{r pressure, echo=FALSE}
plot(pressure)
``
Is there a way to generate automatically a (.Rmd) file for the chuck r cars and another for the chunk r pressure?
Thanks in advance.
A possible solution is to use package parsemd:
library(parsermd)
rmd <- parse_rmd("x.Rmd")
rlabels <- c("cars", "pressure")
yaml <- rmd_select(rmd, has_type("rmd_yaml_list"))
for (i in rlabels)
{
rchunk <- rmd_select(rmd, all_of(i))
sink(paste0(i, ".Rmd"))
cat(as_document(yaml), sep = "\n")
cat(as_document(rchunk), sep = "\n")
sink()
}

When I knit a flextable to html there is no output with the table

The table prints nicely in markdown but is not present in the knitted html file. I noticed that it is classified as a list but don't know how to change it to an acceptable file type. The knitted output is not formatted as a table. I appreciate the help.
library("crosstable") #important package crosstable() function
library('dplyr')
library("flextable")
tbl1 = crosstable(mtcars2, c(1), by = 2) %>%
as_flextable(keep_id=FALSE)
print(tbl1)
According to ?print.flextable
Note also that a print method is used when flextable are used within R markdown documents. See knit_print.flextable().
Therefore, if we want to print in Rmarkdown, either use knitr::knit_print or remove the print as the ?knit_print.flextable documentation shows
You should not call this method directly. This function is used by the knitr package to automatically display a flextable in an "R Markdown" document from a chunk.
---
title: "Testing"
author: "akrun"
date: "09/12/2021"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
```{r, echo = FALSE}
suppressPackageStartupMessages(library("crosstable")) #important package crosstable() function
suppressPackageStartupMessages(library('dplyr'))
suppressPackageStartupMessages(library("flextable"))
tbl1 = crosstable(mtcars2, c(1), by = 2) %>%
as_flextable(keep_id=FALSE)
# either use knit_print or remove the print wrapper
#knitr::knit_print(tbl1)
tbl1
```
-output

How to generate rmarkdown chunk within a chunk

I'm trying to generate a rmarkdown chunk using code. I've read similar questions and their solutions, such as using pander or using cat. I just can't seem to generate it. I also tried knitting the output manually. Here's an example of a Rmd file:
---
title: "test"
output: pdf_document
---
## R Markdown
```{r, results='asis',echo=FALSE}
txt <- paste("```{r}",
"2+2",
"```")
pander::pander(txt)
```
When I knit this, I get a verbatim {r} 2+2. I would like to see the number four instead. In my real example, I'm using bookdown and trying to generate a block2 chunk.
Any ideas how to generate this chunk that gets evaluated as R code?
Does this do what you want?
## R Markdown
```{r, results='asis',echo=FALSE}
txt <- paste("```{r}",
2+2,
"```")
pander::pander(txt)
```
This evalutates to
```{r} 4 ```
in your markdown document.
You using a string literal "2+2" as opposed to the expression 2+2. This is the first issue, I guess.
If you want it correctly parsed you need to add an sep = "\n" argument to paste or manually add the newline breaks.
I.e.
## R Markdown
```{r, results='asis',echo=FALSE}
txt <- paste("```{r}\n",
2+2,
"\n```", sep = "")
pander::pander(txt)
```
This evalutates to
```{r}
4
```
which is interpreted as R code in the markdown document.

rstudio hangs and aborts with rmarkdown loop

I have several datasets each of which have a common grouping factor. I want to produce one large report with separate sections for each grouping factor. Therefore I want to re-run a set of rmarkdown code for each iteration of the grouping factor.
Using the following approach from here doesnt work for me. i.e.:
---
title: "Untitled"
author: "Author"
output: html_document
---
```{r, results='asis'}
for (i in 1:2){
cat('\n')
cat("#This is a heading for ", i, "\n")
hist(cars[,i])
cat('\n')
}
```
Because the markdown I want to run on each grouping factor does not easily fit within one code chunk. The report must be ordered by grouping factor and I want to be able to come in and out of code chunks for each iteration over grouping factor.
So I went for calling an Rmd. with render using a loop from an Rscript for each grouping factor as found here:
# run a markdown file to summarise each one.
for(each_group in the_groups){
render("/Users/path/xx.Rmd",
output_format = "pdf_document",
output_file = paste0(each_group,"_report_", Sys.Date(),".pdf"),
output_dir = "/Users/path/folder")
}
My plan was to then combine the individual reports with pdftk. However, when I get to the about the 5th iteration my Rstudio session hangs and eventually aborts with a fatal error. I have ran individually the Rmd. for the grouping factors it stops at which work fine.
I tested some looping with the following simple test files:
.R
# load packages
library(knitr)
library(markdown)
library(rmarkdown)
# use first 5 rows of mtcars as example data
mtcars <- mtcars[1:5,]
# for each type of car in the data create a report
# these reports are saved in output_dir with the name specified by output_file
for (car in rep(unique(rownames(mtcars)), 100)){
# for pdf reports
rmarkdown::render(input = "/Users/xx/Desktop/2.Rmd",
output_format = "pdf_document",
output_file = paste("test_report_", car, Sys.Date(), ".pdf", sep=''),
output_dir = "/Users/xx/Desktop")
}
.Rmd
```{r, include = FALSE}
# packages
library(knitr)
library(markdown)
library(rmarkdown)
library(tidyr)
library(dplyr)
library(ggplot2)
```
```{r}
# limit data to car name that is currently specified by the loop
cars <- mtcars[rownames(mtcars)==car,]
# create example data for each car
x <- sample(1:10, 1)
cars <- do.call("rbind", replicate(x, cars, simplify = FALSE))
# create hypotheical lat and lon for each row in cars
cars$lat <- sapply(rownames(cars), function(x) round(runif(1, 30, 46), 3))
cars$lon <- sapply(rownames(cars), function(x) round(runif(1, -115, -80),3))
cars
```
Today is `r Sys.Date()`.
```{r}
# data table of cars sold
table <- xtable(cars[,c(1:2, 12:13)])
print(table, type="latex", comment = FALSE)
```
This works fine. So I also looked at memory pressure while running my actual loop over the Rmd. which gets very high.
Is there a way to reduce memory when looping over a render call to an Rmd. file?
Is there a better way to create a report for multiple grouping factors than looping over a render call to an Rmd. file, which doesn't rely on the entire loop being inside one code chunk?
Found a solution here rmarkdown::render() in a loop - cannot allocate vector of size
knitr::knit_meta(class=NULL, clean = TRUE)
use this line before the render line and it seems to work
I am dealing with the same issue now and it's very perplexing. I tried to create some simple MWEs but they loop successfully on occasion. So far, I've tried
Checking the garbage collection between iterations of rmarkdown::render. (They don't reveal any special accumulations.)
Removing all inessential objects
Deleting any cached files manually
Here is my question:
How can we debug hangs? Should we set up special log files to understand what's going wrong?

Knitr output differs between Rmd and Rnw: data.table output example

Revised title to clarify focus.
We notice an anomaly that R markdown and data.table interact in a surprising way. Same does not happen when knitting LaTeX. Commands which not have a return within the R session do cause a return within the knitted markdown output. I trace the problem back to commands like the following, which do not produce an output in R,
````{r}
poolballs[ , weight2:=2 * weight]
```
but inside Rmarkdown, the output includes the full print of the poolballs DT. Same does not happen if we knit an equivalent chunk in LaTeX.
This produced some funny HTML output because I wrote chunks like this, intending to display only the first 5 lines
```{r}
poolballs[ , weight2:=2 * weight]
head(poolballs)
```
Markdown parses that as the equivalent of two chunks,
> poolballs[ , weight2:=2 * weight]
> poolballs
> head(poolballs)
Here's the markdown file to demonstrate
---
title: "Data Table Guide"
author:
- name: Paul Johnson
affiliation: Center for Research Methods and Data Analysis, University of Kansas
email: pauljohn#ku.edu
date: "`r format(Sys.time(), '%Y %B %d')`"
output:
html_document:
theme: united
highlight: haddock
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo=TRUE, comment=NA)
options(width = 70)
```
```{r make_pb_dt}
set.seed(234234)
library(data.table)
poolballs <- data.table(
number = 1:15,
weight = rnorm(15, 45.7, 0.8),
diameter = c(3, 2.9, 3.1) #shows recyling
)
poolballs
```
I want the following to show only head in line 2
```{r}
poolballs[ , weight2:=2 * weight]
head(poolballs)
```
Compare the HTML output:
http://pj.freefaculty.org/scraps/mre-dt.html
I'm sorry if this is a known feature of markdown. I've coded around this wrinkle by hiding chunks, but it seems somewhat inconvenient. Today i'm curious enough to ask you about it. I wrote the same chunks in a LaTeX file and the funny DT output problem does not happen. I put link to PDF from LaTeX in http:/pj.freefaculty.org/scraps/mre-dt-3.pdf
In your final chunk, knitr sees that you have two objects that its attempting to print and you're getting the output for both. This isn't a feature, and has been addressed in a previous question.
If you want to only print the head of the first object in that chunk, your code should be head(poolballs[, weight2:=2 * weight])

Resources