R knitr: Possible to programmatically modify chunk labels?

I'm trying to use knitr to generate a report that performs the same set of analyses on different subsets of a data set. The project contains two Rmd files: the first is a master document that sets up the workspace and the document; the second contains only the chunks that perform the analyses and generate the associated figures.
What I would like to do is knit the master file, which would then call the second file for each data subset and include the results in a single document. Below is a simple example.
Master document:
# My report
```{r}
library(iterators)
data(mtcars)
```
```{r create-iterator}
cyl.i <- iter(unique(mtcars$cyl))
```
## Generate report for each level of cylinder variable
```{r cyl4-report, child='analysis-template.Rmd'}
```
```{r cyl6-report, child='analysis-template.Rmd'}
```
```{r cyl8-report, child='analysis-template.Rmd'}
```
analysis-template.Rmd:
```{r, results='asis'}
cur.cyl <- nextElem(cyl.i)
cat("###", cur.cyl)
```
```{r mpg-histogram}
hist(mtcars$mpg[mtcars$cyl == cur.cyl], main = paste(cur.cyl, "cylinders"))
```
```{r weight-histogram}
hist(mtcars$wt[mtcars$cyl == cur.cyl], main = paste(cur.cyl, "cylinders"))
```
The problem is knitr does not allow for non-unique chunk labels, so knitting fails when analysis-template.Rmd is called the second time. This problem could be avoided by leaving the chunks unnamed since unique labels would then be automatically generated. This isn't ideal, however, because I'd like to use the chunk labels to create informative filenames for the exported plots.
A potential solution would be using a simple function that appends the current cylinder to the chunk label:
```r{paste('cur-label', cyl, sep = "-")}
```
But it doesn't appear that knitr will evaluate an expression in the chunk label position.
I also tried using a custom chunk hook that modified the current chunk's label:
knit_hooks$set(cyl.suffix = function(before, options, envir) {
if (before) options$label <- "new-label"
})
But changing the chunk label didn't affect the filenames for generated plots, so I didn't think knitr was utilizing the new label.
Any ideas on how to change chunk labels so the same child document can be called multiple times? Or perhaps an alternative strategy to accomplish this?

For anyone else who comes across this post, I wanted to point out that @Yihui has provided a formal solution to this question in knitr 1.0 with the introduction of the knit_expand() function. It works great and has really simplified my workflow.
For example, the following will process the template script below for every level of mtcars$cyl, each time replacing all instances of {{ncyl}} (in the template) with its current value:
# My report
```{r}
data(mtcars)
cyl.levels <- unique(mtcars$cyl)
```
## Generate report for each level of cylinder variable
```{r, include=FALSE}
src <- lapply(cyl.levels, function(ncyl) knit_expand(file = "template.Rmd"))
```
`r knit(text = unlist(src))`
Template:
```{r, results='asis'}
cat("### {{ncyl}} cylinders")
```
```{r mpg-histogram-{{ncyl}}cyl}
hist(mtcars$mpg[mtcars$cyl == {{ncyl}}],
main = paste({{ncyl}}, "cylinders"))
```
```{r weight-histogram-{{ncyl}}cyl}
hist(mtcars$wt[mtcars$cyl == {{ncyl}}],
main = paste({{ncyl}}, "cylinders"))
```

If you make all chunks in your child document nameless, i.e. ```{r}, it works. This, of course, is not very elegant, but there are two issues preventing you from changing the label of the current chunk:
A file is parsed before the code blocks are executed. The parser already detects duplicate labels, before any code is executed or custom hooks are called.
The chunk options (inc. the label) are processed before the hook is called (logical: it's an option that triggers a hook), so the hook cannot change the label anymore.
Unnamed blocks work because internally they get the label unnamed-chunk- followed by the chunk number.
Blocks cannot have duplicate names as internally knitr references them by label. A fix could be to make knitr add the chunk number to all chunks with duplicate names. Or to reference them by chunk number instead of label, but that seems to me a much bigger change.

There is a similar question posed here. Based on an arbitrary list of input plots, I was able to programmatically create R chunks and knit the outputs for use in a flexdashboard (quite useful) using the knit_expand(text=) and r paste(knitr::knit(text = paste(out, collapse = '\n'))) methods.
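A rough sketch of that pattern, run as a standalone script rather than inside a flexdashboard: the chunk template, the variable names, and the use of summary() instead of a plot are all assumptions chosen to keep the example small.

```r
library(knitr)

# Build the fence programmatically so this example nests cleanly.
fence <- strrep("`", 3)

# Hypothetical chunk template; {{v}} is replaced by a column name.
chunk_template <- c(
  paste0(fence, "{r summary-{{v}}, echo=FALSE}"),
  "summary(mtcars${{v}})",
  fence
)

vars <- c("mpg", "wt")

# One expanded chunk per variable, each with a unique label.
out <- lapply(vars, function(v) knit_expand(text = chunk_template, v = v))

# Knit the generated source; the result is a markdown string.
res <- knit(text = paste(unlist(out), collapse = "\n"), quiet = TRUE)
cat(res)
```

Inside a parent document, the same generated text would typically be placed with an inline r expression, as in the answer above.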

Related

How to show code but hide output in RMarkdown?

I want my html file to show the code, but not the output of this chunk:
```{r echo=True, include=FALSE}
fun <- function(b)
{
for(a in b)
{print(a)
return(a * a)}
}
y <- fun(b)
```
When I run the code, I need the print to see the progress (it is quite a long function in reality).
But in the knitr file, I use the output in a further chunk, so I do not want to see it in this one (and there's no notion of progress, since the code has already been run).
This echo=True, include=FALSE here does not work: the whole thing is hidden (which is the normal behavior of include=FALSE).
What are the parameters I could use to hide the prints, but show my code?
As @J_F answered in the comments, use {r echo = T, results = 'hide'}.
I wanted to expand on their answer - there are great resources you can access to determine all possible options for your chunk and output display - I keep a printed copy at my desk!
You can find them either on the RStudio Website under Cheatsheets (look for the R Markdown cheatsheet and R Markdown Reference Guide) or, in RStudio, navigate to the "Help" tab, choose "Cheatsheets", and look for the same documents there.
Finally to set default chunk options, you can run (in your first chunk) something like the following code if you want most chunks to have the same behavior:
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = T,
results = "hide")
```
Later, you can modify the behavior of individual chunks like this, which will replace the default value for just the results option.
```{r analysis, results="markup"}
# code here
```
The results = 'hide' option doesn't prevent other messages from being printed.
To hide them, the following options are useful:
{r, error=FALSE}
{r, warning=FALSE}
{r, message=FALSE}
In every case, the corresponding warning, error or message will be printed to the console instead.
{r eval=FALSE}
The document will display the code by default but will prevent the code block from being executed, and thus will also not display any results.
To mute the output of library("name_of_library") calls, i.e. to show only the code, {r loadlib, echo=T, results='hide', message=F, warning=F} is great. And imho it is a better way than library(package, warn.conflicts=F, quietly=T).
For completely silencing the output, here is what works for me:
```{r error=FALSE, warning=FALSE, message=FALSE}
invisible({capture.output({
# Your code here
2 * 2
# etc etc
})})
```
The 5 measures used above are
error = FALSE
warning = FALSE
message = FALSE
invisible()
capture.output()
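Outside of chunk options, the last two measures can be tried in a plain R session. This is a small sketch with a made-up `noisy()` function standing in for the long-running code; it shows that capture.output() swallows what print() writes while the assignment inside it still takes effect.

```r
# Hypothetical stand-in for a function that prints progress as it runs.
noisy <- function(x) {
  print("working...")
  x * 2
}

# capture.output() collects the printed text instead of showing it;
# the assignment to `result` still happens in the calling environment.
captured <- capture.output(result <- noisy(2))

# invisible() keeps the value from auto-printing at top level.
invisible(result)
```

After this runs, `captured` holds the printed progress line and `result` holds the computed value, so later chunks can still use it.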
To hide warnings, you can also do
{r, warning=FALSE}

knitr and knit_print with HTML output - dispatch not working

I have a shiny app (radiant.data) that uses knitr to generate reports viewable inside the application (R > Report). Because the output is displayed inside a shiny app I need a render function for something like a DT table to be displayed (i.e., convert to shiny.render.function). This all works fine.
Now, however, I want to use a custom print method to handle rendering so I can just use DT::datatable(mtcars) with knitr and knit_print.datatables to generate a shiny.render.function.
Below is some example R Markdown that includes the knit_print.datatables function I'm using. chunk1 and chunk2 show a shiny.render.function as intended, but chunk6 shows nothing. If screenshot.force = TRUE, I get a screenshot from chunk6, but that is not what I want either.
Example R-markdown to be processed with knitr::knit2html
```{r chunk1}
knitr::opts_chunk$set(screenshot.force = FALSE)
DT::renderDataTable(DT::datatable(mtcars))
```
```{r chunk2}
knit_print.datatables <- function(object, ...) {
DT::renderDataTable(object)
}
```
```{r chunk3}
knit_print <- knitr::knit_print
knit_print.datatables(DT::datatable(mtcars))
```
```{r chunk4}
getS3method("knit_print", "datatables")
```
```{r chunk5}
class(DT::datatable(mtcars))
```
```{r chunk6}
DT::datatable(mtcars)
```
I realize the R-markdown above, although reproducible if you have knitr and DT installed, looks a bit weird but when I export the knit_print.datatables function properly I get the same result in my application (see example output below).

Data and plots from different chunks

I am creating a Word report with RStudio, markdown and knitr, and I am having some trouble.
My R code is split into several chunks because I want to place the report's text between them.
The problem is this: if I use a single chunk, the report is fine, but I can't include text or comments in the report unless I also print the code (right?). If I use multiple chunks, then the plots are not included in the report when compiling, and warning messages appear:
pandoc.exe: Could not find image `Scriptv01_files/figure-docx/4.PLOTS-1.png', skipping...
It only works with HTML output: report includes all plots, but not with DOC nor PDF output.
I think the issue is that the data object is created in a different chunk, but I have tried 'cache' and 'autodep' options with no success.
How can this be done? What's the problem with the code?
Many thanks!
Here I provide a code example:
---
output: word_document
---
# PROJECT: IRIS STUDY
#### Statistical Analysis
```{r setup}
require(knitr)
opts_chunk$set(echo = TRUE, message=FALSE, warning=FALSE, comment='')
```
```{r read data}
dataset<-iris
```
### Data Descriptive by Iris Specie
```{r 4. ANALYSE DATA - DATA DESCRIPTION BY SPECIE}
require(ggplot2)
ggplot(dataset, aes(Species)) + geom_bar(aes(fill=Species))+
labs(x = "Species", y = "Number of Flowers")+ ggtitle("Fisher's Iris data set")
```
knitr uses the chunk name as part of the image file name. The chunk name 4. ANALYSE DATA - DATA DESCRIPTION BY SPECIE is invalid and is the reason why the plot is not being created. Replacing the name by a valid name solves the problem:
Avoid spaces and periods . in chunk labels and directory names [Source]
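As an illustration of what a valid name could look like, the sketch below turns an arbitrary heading into a label made only of letters, digits and dashes. `sanitize_label()` is a hypothetical helper written for this example, not a knitr function.

```r
# Hypothetical helper: reduce a free-text label to letters, digits and dashes.
sanitize_label <- function(label) {
  label <- gsub("[^A-Za-z0-9]+", "-", label)  # runs of other characters -> "-"
  label <- gsub("^-+|-+$", "", label)         # trim leading/trailing dashes
  tolower(label)
}

clean <- sanitize_label("4. ANALYSE DATA - DATA DESCRIPTION BY SPECIE")
print(clean)
```

The cleaned string can then be typed in as the chunk label in place of the original one.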

Loop in R markdown

I have an R markdown document like this:
The following graph shows a histogram of variable x:
```{r}
hist(x)
```
I want to introduce a loop, so I can do the same thing for multiple variables. Something hypothetically like this:
for i in length(somelist) {
output paste("The following graph shows a histogram of somelist[[" , i, "]]")
```{r}
hist(somelist[[i]])
```
Is that even possible?
PS: The greater plan is to create a program that goes over a data frame and automatically generates appropriate summaries for each column (e.g. histograms, tables, box plots, etc.). The program can then be used to automatically generate a markdown document that contains the exploratory analysis you would do when seeing a data set for the first time.
Could that be what you want?
---
title: "Untitled"
author: "Author"
output: html_document
---
```{r, results='asis'}
for (i in 1:2){
cat('\n')
cat("# This is a heading for ", i, "\n")
hist(cars[,i])
cat('\n')
}
```
This answer was more or less stolen from here.
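The same loop idea can be stretched toward the data-frame plan mentioned in the question. The following is only a sketch: `explore_columns()` is a hypothetical helper, and the rule "histogram for numeric columns, bar plot otherwise" is an assumption. It is meant to be called from a chunk with results='asis' so the headings render as markdown.

```r
# Hypothetical helper: one heading and one plot per column of a data frame.
explore_columns <- function(df) {
  kinds <- character(0)
  for (col in names(df)) {
    # Blank lines around the heading keep markdown from merging it with output.
    cat("\n\n## ", col, "\n\n", sep = "")
    x <- df[[col]]
    if (is.numeric(x)) {
      hist(x, main = paste("Histogram of", col), xlab = col)
      kinds[col] <- "histogram"
    } else {
      barplot(table(x), main = paste("Counts of", col))
      kinds[col] <- "barplot"
    }
  }
  invisible(kinds)  # record of which plot type was used per column
}

kinds <- explore_columns(iris)
```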
As already mentioned, any loop needs to be in a code chunk. It might be easier to give the histogram a title rather than add a line of text as a header for each one.
```{r}
for (i in seq_along(somelist)) {
title <- paste0("The following graph shows a histogram of somelist[[", i, "]]")
hist(somelist[[i]], main=title)
}
```
However, if you would like to create multiple reports then check out this thread.
Which also has a link to this example.
It seems that when the render call is made from within a script, the variables in the calling environment are visible to the Rmd file.
So an alternative might be to have your R script:
for (i in seq_along(somelist)) {
rmarkdown::render('./hist_.Rmd', # file 2
output_file = paste("hist", i, ".html", sep=''),
output_dir = './outputs/')
}
And then your Rmd chunk would look like:
```{r}
hist(somelist[[i]])
```
Disclaimer: I haven't tested this.

Defer code to END of document in knitr

I am trying to write a report in rmarkdown and then use knitr to generate a pdf.
I want all the code to be pushed to the "End of the document", while just displaying results interweaved with my text. The echo='hold' option doesn't do this.
Section of my markdown file
Generate data
```{r chunk1,echo='hold',R.options=}
num_seq<-rnorm(100,0.2)
num_seq
```
We further report the mean of these numbers.
```{r,echo='hold' }
mean(num_seq)
```
I have tried to read the relevant documentation found here http://yihui.name/knitr/options/, but I can't figure out how to do this.
I don't think echo='hold' is an option. Regardless, the trick is to use echo=FALSE where the code is included, and then re-use the same chunk name and use eval=FALSE where you want the code to be printed. (Other options in both locations are fine, but these two are the minimum required.)
The following evaluates the code (and optionally includes output from it) where the chunk is located, but doesn't include the code until you specify.
# Header 1
```{r chunk1, echo=FALSE}
x <- 1
x + 5
```
This is a test.
```{r chunk1, eval=FALSE}
```
Results in the following markdown:
Header 1
========
## [1] 6
This is a test.
x <- 1
x + 5
Edit: I use this frequently in R markdown documents with randomness: I store the random seed in the very beginning (whether I set it manually or just store the current random state for later reproduction) and display it in an annex/appendix:
# Header 1
```{r setseed, echo=FALSE, include=FALSE}
set.seed(seed <- sample(.Machine$integer.max, size=1))
seed
```
This is a test `r seed`.
# Annex A {-}
```{r showsetseed, ref.label='setseed', eval=FALSE}
```
```{r printseed, echo=FALSE}
seed
```
This example doesn't include the results with the original code chunk. Unfortunately, the results aren't stored, and if I set eval=TRUE when I use the same chunk name later, it will calculate and present a different seed. That's why the printseed block is there. The reason I explicitly "show" seed in the first setseed block is solely so that, in the annex, the showsetseed and printseed chunks flow well. (Otherwise, set.seed does not return a number, so it would have looked weird.)
BTW: this second example uses ref.label, which Yihui documents here as a more general approach to chunk reuse.
BTW #2: when I said "store the random state", that's not completely correct ... I'm storing a randomly-generated seed. The random state itself is much larger than a single integer, of course. I don't want to anger the PRNG gods :-)
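A quick way to convince yourself that storing the generated seed is enough for later reproduction is the following sketch, which uses only base R; the variable names are arbitrary.

```r
# Draw a random seed, as in the setseed chunk above, and keep it.
seed <- sample(.Machine$integer.max, size = 1)

# First run with that seed.
set.seed(seed)
first <- rnorm(3)

# Re-seeding with the stored value reproduces the same draws.
set.seed(seed)
second <- rnorm(3)

print(identical(first, second))
```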
