Using an R script to create a highly repetitive Rmarkdown script - r

I analyze survey results regularly and like to use Rmarkdown so I can make nice HTML output of the results.
The surveys can have many questions (around 40), so creating 40 code chunks, with highly repetitive code and headers, can be annoying. I think I can easily generate them with a loop in R. However, I'm stuck on how to combine these two processes!
This was close --
how to create a loop that includes both a code chunk and text with knitr in R
But in the end, it was just a loop, and not very flexible. So, I couldn't add a figure to question 22 (or whatever).
### Question 1
#### `r key$Question_Text[key$Question=="Q1"][1]`
```{r chunk1}
quest <- "Q1"
# code for question 1
```
### Question 2
#### `r key$Question_Text[key$Question=="Q2"][1]`
```{r chunk2}
quest <- "Q2"
# Identical code for question 2
```
....and so on....
### Question 35
#### `r key$Question_Text[key$Question=="Q35"][1]`
```{r chunk35}
quest <- "Q35"
# Identical code for question 35
```
Because sometimes, a question has a special type of figure or tweak, I want the output to be something I can paste into RMD and make all the changes there. I just want to skip ahead as much as possible... by making all the boring, identical steps, fully automated.

Make strings that I can refer to in the loop:
question <- paste("Question", 1:20)
qnum <- paste0("Q", 1:20)
# one inline-R header per question; vectorized over qnum so it can be indexed with i in the loop
quest_text_code <- paste0("#### ", "`r key$Question_Text[key$Question==", "\"", qnum, "\"", "][1]`")
chunk <- paste0("chunk", 1:20)
Use sink() to send the output to a text file:
sink("outfile.txt")
Loop, paste, and output into the sink:
# note: paste() separates its arguments with a space, which is where the extra leading spaces come from
for(i in 1:20){ cat(paste("###", question[i], "\n", "\n",
quest_text_code[i],"\n", "\n",
"```", "{r ", chunk[i], "}", "\n","\n",
"function.dat(", qnum[i], ")","\n", "\n",
"function.dat.nr(", qnum[i], ")","\n", "\n",
"```", "\n", "\n"))
}
sink() # ends sink (dev.off() would close a graphics device, not the sink)
After that, I was able to copy the output to an RMD file and use find and replace on a few glitches (extra leading spaces). I also had trouble adding "" marks with paste.
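Both glitches can probably be avoided by building each line with sprintf() and writing the lines with writeLines(), so that no separator spaces are inserted and the quotes are part of the template. The following is a minimal sketch under assumptions, not the original code: the helper name make_question_block and the output file name are hypothetical, and the chunk body is a placeholder.

```r
# Three backticks, built at run time so the fence can be reused in templates
fence <- strrep("`", 3)

# Hypothetical helper: returns the Rmd lines for question number n
make_question_block <- function(n) {
  qnum <- paste0("Q", n)
  c(
    sprintf("### Question %d", n),
    sprintf("#### `r key$Question_Text[key$Question == \"%s\"][1]`", qnum),
    sprintf("%s{r chunk%d}", fence, n),
    sprintf("quest <- \"%s\"", qnum),
    "# code for this question",
    fence,
    ""
  )
}

# Build all 20 blocks and write them to a file (hypothetical name, in tempdir())
rmd_lines <- unlist(lapply(1:20, make_question_block))
out_file <- file.path(tempdir(), "survey_skeleton.Rmd")
writeLines(rmd_lines, out_file)
```

The resulting file can then be pasted into the real Rmd and tweaked by hand for the questions that need a special figure.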

Related

Is there an R script that write rmd files with the content from excel spreadsheets?

Solution in comments!
I'd like to have a template for the rmd file and have the template filled in with content, both R code chunks and regular text, according to the content of the spreadsheet.
| Content to be used in code chunk | Text to be pasted after code chunk |
|---|---|
| Parametrized code 1 | Description of event 1 |
| Parametrized code 2 | Description of event 2 |
Rmd output:
"""{r, echo = FALSE, comment = NA}
set.seed(rand_seed)
parametrized code 1
"""
Description of event 1
Thanks for your help!
I don't know if such a tool is available as of yet. However, you may be interested in taking a look at this topic https://bookdown.org/yihui/rmarkdown/parameterized-reports.html, from the R Markdown Definitive Guide.
EDIT:
Well, I came up with a simple solution. I'm not sure of how big your needs are for this logic, but for only a handful of chunks, maybe this script could help you. I tested it and it works.
instructions_set <- data.frame(
code_chunks = c(
"a <- 50; print(a)",
"hist(iris$Sepal.Length)"
),
text_chunks = c(
"I've just set the variable a to 50 and printed it.",
"This is a histogram of the variable Sepal.Length in the Iris dataset."
)
)
file <- apply(instructions_set, MARGIN = 1, function(x) {
x[1] <- paste0("```{r}\n", x[1], "\n```")
return(
paste(x[1], x[2], "", sep = "\n")
)
})
readr::write_file(purrr::reduce(file, paste0), "test_file.Rmd")
I've found a solution using the knitrdata package. You can use the create_chunk and insert_chunk functions to convert strings to code chunks for rmarkdown.
My next step would be to load the strings into a data frame and iterate these two functions over the content of the data frame to create and insert multiple code chunks to the rmd file.
For the text part of the Rmd file, I'd use the writeLines function.
Here's a sample of the code chunk creation code I've put together:
library(knitrdata)
library(svDialogs)
#choose the blank template Rmd file
input_file <- dlg_open(message = "Please select the blank Rmd file to use as a template!" )$res
output_file <- paste(as.character(dlg_input( message = "Please name your output file!")$res), ".Rmd", sep = "")
#select the line at which to insert the code chunk - I've chosen the last line of the file
insert_at <- length(readLines(input_file))
#takes string input and formats it as a chunk then returns it as a character vector
input_chunk <- create_chunk(text = "print('this is a test')", chunk_label = "Testing label creation", chunk_type = "r")
#inserts properly formatted chunk at the specified line of lines read from the target Rmd file
rmd_text <- insert_chunk(input_chunk, line = insert_at , rmd.file = input_file )
#since insert_chunk returns a character vector, this line writes it to a new Rmd file
writeLines(rmd_text,output_file)
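The "next step" described above (iterating over a data frame of code and text) could also be sketched in base R alone, without knitrdata. This is only an illustration under assumptions: the data frame stands in for the spreadsheet content, and make_section, the header lines, and the output file name are hypothetical.

```r
fence <- strrep("`", 3)  # three backticks

# Stand-in for the spreadsheet content (in practice this could be read from Excel)
content <- data.frame(
  code = c("summary(cars)", "hist(iris$Sepal.Length)"),
  text = c("Description of event 1", "Description of event 2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one code chunk followed by its text, as a vector of lines
make_section <- function(code, text) {
  c(paste0(fence, "{r, echo = FALSE, comment = NA}"), code, fence, "", text, "")
}

body_lines <- unlist(Map(make_section, content$code, content$text), use.names = FALSE)

# A minimal YAML header standing in for the blank template file
template_lines <- c("---", "title: \"Generated report\"", "output: html_document", "---", "")

out_file <- file.path(tempdir(), "generated_report.Rmd")
writeLines(c(template_lines, body_lines), out_file)
```

Keeping the chunk layout in one small function means a change to the chunk options only has to be made in one place.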

RMarkdown: ggplot into a table

There are already a few questions about ggplots in RMarkdown, but none has answered my question of how to put a ggplot into a table with kable() from knitr.
I've tried this link:
How can I embed a plot within a RMarkdown table?
But have not had any luck so far. Any ideas?
The idea was to put all plots into a list with
a<-list(p1,p2,p3...)
and then having the table with
{r}kable(a)
Additional text should also be able to be included
b<-c("x","y","z",...)
kable (c(a,b),col.names=c())
Thanks for your help
Frieder
I experimented some with this and the following is the best I could come up with. This is a complete markdown document you should be able to paste into RStudio and hit the Knit button.
Two relevant notes here.
Setting the file links directly into kable doesn't work as it is wrapped in html such that it is interpreted as text, so we need to gsub() it in. An alternative is to set kable(..., escape = FALSE), but it is a risk that other text might cause problems.
Also, the chunk option results = 'asis' is necessary to have the print(kab) return raw html.
I don't know if these are problems for the real application.
---
title: "Untitled"
author: "me"
date: "02/06/2020"
output: html_document
---
```{r, results = 'asis'}
library(ggplot2)
library(svglite)
n <- length(unique(iris$Species))
data <- split(iris, iris$Species)
# Create list of plots
plots <- lapply(data, function(df) {
ggplot(df, aes(Sepal.Width, Sepal.Length)) +
geom_point()
})
# Create temporary files
tmpfiles <- replicate(n, tempfile(fileext = ".svg"))
# Save plots as files, get HTML links
links <- mapply(function(plot, file) {
# Suit exact dimensions to your needs
ggsave(file, plot, device = "svg", width = 4, height = 3)
paste0('<figure><img src="', file, '" style = "width:100%"></figure>')
}, plot = plots, file = tmpfiles)
# Table formatting
tab <- data.frame(name = names(plots), fig = paste0("dummy", LETTERS[seq_len(n)]))
kab <- knitr::kable(tab, "html")
# Substitute dummy column for figure links
for (i in seq_len(n)) {
kab <- gsub(paste0("dummy", LETTERS[i]), links[i], kab, fixed = TRUE)
}
print(kab)
```
I have found my way around it as described in the link I posted.
I. Saved my plot as a picture
II. Used sprintf() to insert picture into table with this command from Rmarkdown:
![](path/to/file)
Poor, but it works. If anybody finds a solution, I will always be interested in smart coding.
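The sprintf() workaround described above might look roughly like the sketch below. The image paths are hypothetical placeholders for wherever the plots were saved, and the knitr call is guarded so the snippet also runs where knitr is not installed.

```r
# Hypothetical image paths; in practice these are the files the plots were saved to
plot_files <- c(
  setosa     = "figs/setosa.png",
  versicolor = "figs/versicolor.png",
  virginica  = "figs/virginica.png"
)

# One table row per plot; the fig column holds markdown image syntax
tab <- data.frame(
  name = names(plot_files),
  fig  = sprintf("![](%s)", unname(plot_files)),
  stringsAsFactors = FALSE
)

# From the console this prints the markdown table; inside an R Markdown chunk,
# knitr::kable(tab) as the last expression of the chunk is enough.
if (requireNamespace("knitr", quietly = TRUE)) {
  print(knitr::kable(tab))
}
```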

Import and manipulate R Scripts programmatically / Convert .R to .Rmd

I have an R script that I would like to import from within a different R script, manipulate its content (search and replace), and save with a different extension (.Rmd).
This is how the example.R File would look before manipulation:
# A title
# chunkstart
plot(1,1)
# chunkend
and this is how example.Rmd it would look after manipulation: replaced "# chunkstart" and "# chunkend" with ```{r} and ```, respectively.
# A title
```{r}
plot(1,1)
```
I've been searching for methods to do this, but so far have found none. Any ideas?
I'm sure that you can do it using regex with fewer lines of code.
However, this should solve your problem.
library(magrittr)
readLines('example.R') %>%
stringr::str_replace("# chunkstart", "```{r}") %>%
stringr::str_replace("# chunkend", "```") %>%
writeLines("example.Rmd")
With the following lines of code you will be able to apply this "operation" in every .R file inside /path_to_some_directory
lapply(list.files('/path_to_some_directory', pattern = "\\.R$",
full.names = TRUE), function(data) {
readLines(data) %>%
stringr::str_replace("# chunkstart", "```{r}") %>%
stringr::str_replace("# chunkend", "```") %>%
writeLines(paste0(data, "md"))
})
Hope it helps!
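For comparison, the same replacement can be sketched in base R without magrittr or stringr. The helper name convert_markers is hypothetical; it matches whole lines exactly, so lines that merely contain the marker text somewhere in the middle are left alone.

```r
fence <- strrep("`", 3)  # three backticks

# Hypothetical helper: swap the marker comments for chunk fences
convert_markers <- function(r_lines) {
  rmd_lines <- r_lines
  rmd_lines[trimws(r_lines) == "# chunkstart"] <- paste0(fence, "{r}")
  rmd_lines[trimws(r_lines) == "# chunkend"] <- fence
  rmd_lines
}

# The example script from the question, as a character vector of lines
example <- c("# A title", "# chunkstart", "plot(1,1)", "# chunkend")
converted <- convert_markers(example)

# For a real file one could write:
# writeLines(convert_markers(readLines("example.R")), "example.Rmd")
```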
I think ?knitr::spin is a relevant answer to the question (specifically asking for ideas), or at least a useful alternative to consider.
You'd have to slightly reformat the input, but the benefits would be a built-in, much richer and versatile way to deal with chunk options and formatting.
Here's what an annotated R script might look like (with spin's default regexs),
#' ## A title
#' first chunk
#- fig.width=10
plot(1,1)
# some text
#' another chunk
plot(2,2)
and the output Rmd reads,
## A title
first chunk
```{r fig.width=10}
plot(1,1)
# some text
```
another chunk
```{r }
plot(2,2)
```

rstudio hangs and aborts with rmarkdown loop

I have several datasets each of which have a common grouping factor. I want to produce one large report with separate sections for each grouping factor. Therefore I want to re-run a set of rmarkdown code for each iteration of the grouping factor.
The following approach (from here) doesn't work for me:
---
title: "Untitled"
author: "Author"
output: html_document
---
```{r, results='asis'}
for (i in 1:2){
cat('\n')
cat("#This is a heading for ", i, "\n")
hist(cars[,i])
cat('\n')
}
```
This is because the markdown I want to run for each grouping factor does not easily fit within one code chunk. The report must be ordered by grouping factor, and I want to be able to move in and out of code chunks for each iteration over the grouping factor.
So I went for calling an Rmd. with render using a loop from an Rscript for each grouping factor as found here:
# run a markdown file to summarise each one.
for(each_group in the_groups){
render("/Users/path/xx.Rmd",
output_format = "pdf_document",
output_file = paste0(each_group,"_report_", Sys.Date(),".pdf"),
output_dir = "/Users/path/folder")
}
My plan was then to combine the individual reports with pdftk. However, at about the 5th iteration my RStudio session hangs and eventually aborts with a fatal error. I have run the Rmd individually for the grouping factors it stops at, and those work fine.
I tested some looping with the following simple test files:
.R
# load packages
library(knitr)
library(markdown)
library(rmarkdown)
# use first 5 rows of mtcars as example data
mtcars <- mtcars[1:5,]
# for each type of car in the data create a report
# these reports are saved in output_dir with the name specified by output_file
for (car in rep(unique(rownames(mtcars)), 100)){
# for pdf reports
rmarkdown::render(input = "/Users/xx/Desktop/2.Rmd",
output_format = "pdf_document",
output_file = paste("test_report_", car, Sys.Date(), ".pdf", sep=''),
output_dir = "/Users/xx/Desktop")
}
.Rmd
```{r, include = FALSE}
# packages
library(knitr)
library(markdown)
library(rmarkdown)
library(tidyr)
library(dplyr)
library(ggplot2)
library(xtable) # needed for the xtable() call below
```
```{r}
# limit data to car name that is currently specified by the loop
cars <- mtcars[rownames(mtcars)==car,]
# create example data for each car
x <- sample(1:10, 1)
cars <- do.call("rbind", replicate(x, cars, simplify = FALSE))
# create hypotheical lat and lon for each row in cars
cars$lat <- sapply(rownames(cars), function(x) round(runif(1, 30, 46), 3))
cars$lon <- sapply(rownames(cars), function(x) round(runif(1, -115, -80),3))
cars
```
Today is `r Sys.Date()`.
```{r}
# data table of cars sold
table <- xtable(cars[,c(1:2, 12:13)])
print(table, type="latex", comment = FALSE)
```
This works fine. So I also looked at memory pressure while running my actual loop over the Rmd. which gets very high.
Is there a way to reduce memory when looping over a render call to an Rmd. file?
Is there a better way to create a report for multiple grouping factors than looping over a render call to an Rmd. file, which doesn't rely on the entire loop being inside one code chunk?
Found a solution here rmarkdown::render() in a loop - cannot allocate vector of size
knitr::knit_meta(class=NULL, clean = TRUE)
use this line before the render line and it seems to work
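Put together, the loop with the cleanup call before each render might be organized as in the sketch below. The helper render_groups is hypothetical and takes the render and reset steps as functions, so the control flow can be tried out without rendering anything; the commented call shows the intended use with the file and variable names from the question.

```r
# Hypothetical helper: call reset() before every render_one(group)
render_groups <- function(groups, render_one, reset = function() NULL) {
  outputs <- character(0)
  for (g in groups) {
    reset()                               # e.g. clear knitr metadata from the previous run
    outputs <- c(outputs, render_one(g))  # collect whatever the render step returns
  }
  outputs
}

# Intended use (not run here; path and the_groups are taken from the question):
# render_groups(
#   the_groups,
#   render_one = function(g) rmarkdown::render(
#     "/Users/path/xx.Rmd",
#     output_format = "pdf_document",
#     output_file = paste0(g, "_report_", Sys.Date(), ".pdf"),
#     output_dir = "/Users/path/folder"
#   ),
#   reset = function() knitr::knit_meta(class = NULL, clean = TRUE)
# )
```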
I am dealing with the same issue now and it's very perplexing. I tried to create some simple MWEs but they loop successfully on occasion. So far, I've tried
Checking the garbage collection between iterations of rmarkdown::render. (They don't reveal any special accumulations.)
Removing all inessential objects
Deleting any cached files manually
Here is my question:
How can we debug hangs? Should we set up special log files to understand what's going wrong?

Create a loop that includes both a code chunk and text

I am trying to figure out how to create a loop that inserts some text into the rmarkdown file, and then produces the graph or table that corresponds to that header. The following is how I picture it working:
for(i in 1:max(month)){
### `r month.name[i]` Air quaility
```{r, echo=FALSE}
plot(airquality[airquality$Month == 5,])
```
}
This of course just prints the for loop as text, and if I wrap the for loop in inline R code (`r `) I just get an error.
I want the code to produce an rmd file that looks like this:
May Air Quality
Plot
June Air Quality
Plot
and so on and so forth.
Any ideas? I cannot use LaTeX because at my work they do not let us download exe files, and I do not know how to use LaTeX anyway. I want to produce a Word document.
You can embed the markdown inside the loop using cat().
Note: you will need to set results="asis" for the text to be rendered as markdown.
Note well: you will need two spaces in front of the \n new line character to get knitr to properly render the markdown in the presence of plot output.
# Monthly Air Quality Graphs
```{r pressure,fig.width=6,echo=FALSE,message=FALSE,results="asis"}
for(i in unique(airquality$Month)) {
cat("  \n###", month.name[i], "Air Quality  \n")
#print(plot(airquality[airquality$Month == i,]))
plot(airquality[airquality$Month == i,])
cat("  \n")
}
```
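To see exactly what the cat() calls hand to the markdown renderer, the heading logic can be pulled into a small function and its output captured. This is only a sketch: emit_heading is a hypothetical helper, and capture.output() is used here purely for inspection; inside a results="asis" chunk the helper would be called directly, followed by the plot.

```r
# Hypothetical helper: print a markdown heading surrounded by line breaks
emit_heading <- function(title, level = 3) {
  cat("  \n", strrep("#", level), " ", title, "  \n", sep = "")
}

# Capture what the loop would emit, one element per output line
out <- capture.output(
  for (i in unique(airquality$Month)) {
    emit_heading(paste(month.name[i], "Air Quality"))
  }
)

# Keep only the heading lines and drop the trailing spaces
headings <- trimws(out[grepl("^###", out)])
```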
As mentioned here, you could also make use of the pander package:
# Monthly Air Quality Graphs
```{r pressure2, fig.width=6, echo=FALSE, message=FALSE, results="asis"}
library(pander)
for (i in unique(airquality$Month)) {
# Inserts Month titles
pander::pandoc.header(month.name[i], level = 3)
# Section contents
plot(airquality[airquality$Month == i,])
# adding also empty lines, to be sure that this is valid Markdown
pander::pandoc.p('')
pander::pandoc.p('')
}
```
Under some conditions I find it helpful to write a loop that writes chunk code rather than write a chunk that runs a loop. Weird solution but it has worked for me beautifully in the past when a bare bones set of chunks is all I need. For your airquality case it would look like this:
## model chunk ##
# ## May Air Quality
# ```{r May}
#
# plot(airquality[airquality$Month == 5,])
#
# ```
# all months in airquality
aqmonths <- c("May",
"June",
"July",
"August",
"September")
for (m in aqmonths) {
cat(
paste0(
"## ", m, " Air Quality",
"\n\n",
"```{r ", m, "}",
"\n\n",
"plot(airquality[airquality$Month == ", match(m, month.name), ",])",
"\n\n",
"```",
"\n\n"
)
)
}
This will print code for all 5 chunks to the console, and then I can copy and paste into a .Rmd document. It is possible to include any chunk options such as captions or fig arguments in the chunk-writing loop as well. Depending on what else you try to bring in, using functions like match() as in the example is often helpful.
Pros: Preserves ability to use cross-references and set individual captions or options.
Cons: Making changes to all chunks usually requires re-copying the entire output of the chunk-writing loop, which can be tiresome and a bit unwieldy.
What about reusing the chunks inside a loop using <<label>> as described here: https://bookdown.org/yihui/rmarkdown-cookbook/reuse-chunks.html
Label your chunk, set eval=F, and have it refer to the loop variable i:
```{r my_chunk, echo=FALSE, eval=F}
plot(airquality[airquality$Month == i,])
```
Then loop over it from inside another code chunk:
for(i in unique(airquality$Month)){
<<my_chunk>>
}
