A Way in Knitr to Copy a Chunk?

Knitr Mavens,
Background: I am using knitr to produce a report with many embedded graphs. In the body of the report, only the graph is appropriate, not the code.
For example:
```{r graph_XYZ_subset, echo = FALSE, message = TRUE,
fig.cap = "Text that explains the graph"}
graph.subset <- ggplot() + ...
```
This part works just fine.
However, there is a need to display the key parts of the code (e.g., key statistical analyses and key graph generations)...but in an Addendum.
Which leads to this question: is there a way to copy a knitr chunk from the early parts of a script to a later part?
To ensure accuracy, it's ideal that the code in the Addendum list (display) all the code that was actually executed in the report.
For example:
# ADDENDUM - Code Snippets
### Code to Generate Subset Graph
\\SOMEHOW COPY the code from graph_XYZ_subset to here without executing it.
### Code to Compute the Mean of Means of the NN Factors
\\Copy another knitr chunk which computes the mean of means, etc.
### And So On...
\\Copy chunks till done
* * * * * * * *
Any ideas? Is there a way in knitr to perform these types of chunk copies?

There are several options; four of them are listed and briefly explained below. Yihui's explanations in How to reuse chunks might also help.
\documentclass{article}
\begin{document}
\section{Output}
<<mychunk, echo = FALSE>>=
print("Hello World!")
@
\section{Source code}
Option 1: Use an empty chunk with the same label.
<<mychunk, eval = FALSE>>=
@
Option 2: Embed in other chunk (no advantage in this case). Note that there is no equality sign and no at sign for the inner chunk.
<<myOtherChunk, eval = FALSE>>=
<<mychunk>>
@
Option 3: Use \texttt{ref.label}.
<<ref.label = "mychunk", eval = FALSE>>=
@
Option 4: Define the chunk in an external file you read in using \texttt{read\_chunk}. Then use Option 1--3 to execute the chunk (with \texttt{eval = TRUE}; default) or show its code (with \texttt{eval = FALSE}).
\end{document}
I usually prefer Option 4. This allows you to separate the programming logic from writing the document.
At the place where mychunk is to be executed and the graph will appear in the PDF, you only have <<mychunk>>= in your Rnw file and don't have to bother with all the code that generates your graph. Developing your code is also easier, because in an interactive session you have all your code in one spot and don't have to scroll through all the text of the report when going from one chunk to the next.
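As a rough sketch of Option 4, the external script could look like the block below. The file name `chunks.R` and the variable `greeting` are made up for illustration; the `## ---- label ----` comment is the marker that `read_chunk()` uses to split a script into labeled chunks. In the Rnw file, an early chunk would call `knitr::read_chunk("chunks.R")`, after which `<<mychunk>>=` (or `<<mychunk, eval = FALSE>>=`) refers to this code.

```r
# chunks.R (hypothetical file name): a plain R script holding the report's code.
# Each "## ---- label ----" line starts a chunk that knitr::read_chunk() can import.

## ---- mychunk ----
greeting <- "Hello World!"
print(greeting)
```

Because the file is ordinary R, it can also be sourced or stepped through interactively without knitting the report.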
EDIT:
The options mentioned above have in common that you need to manually maintain a list of the chunks to show in the appendix. Here are two options to avoid this; unfortunately, both have some drawbacks:
Option 1: Automatically create a list of all chunks that have been executed and show their code.
This can be achieved using a chunk hook that registers all chunk names. Include the following chunk before all other chunks in the document:
<<echo = FALSE>>=
library(knitr)
myChunkList <- c()
listChunks <- function(before, options, envir) {
  if (before) {
    myChunkList <<- c(myChunkList, options$label)
  }
  return(NULL)
}
knit_hooks$set(recordLabel = listChunks) # register the new hook
opts_chunk$set(recordLabel = TRUE) # enable the new hook by default
@
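To see what the hook collects, it can be exercised by hand outside knitr. The following is only a sketch: it redefines the hook so that it runs without knitr loaded, and the loop imitates knitr calling the hook before and after each chunk. The labels are made-up examples.

```r
# Stand-alone copy of the hook, so this runs without knitr.
myChunkList <- c()
listChunks <- function(before, options, envir) {
  if (before) {
    # Record the label only on the "before" call, i.e. once per chunk run.
    myChunkList <<- c(myChunkList, options$label)
  }
  NULL
}

# Imitate knitr invoking the hook before and after three chunk runs
# (hypothetical labels; "setup" appears twice on purpose).
for (lbl in c("setup", "graph_XYZ_subset", "setup")) {
  listChunks(before = TRUE,  options = list(label = lbl), envir = globalenv())
  listChunks(before = FALSE, options = list(label = lbl), envir = globalenv())
}

# unique() drops repeated labels before they are handed to ref.label.
labels_to_show <- unique(myChunkList)
print(labels_to_show)
```

The `unique()` call mirrors the one in the `showCode` chunk, so a label that was recorded more than once is listed only once.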
Where you want to show the code (for example in the appendix), insert the following chunk:
<<showCode, ref.label = unique(myChunkList), eval = FALSE>>=
@
Unfortunately, there will be no margin or any other visual separation between the chunks.
Option 2: Using the chunk hook is not always necessary because there is the function all_labels() that returns a list of all chunk labels. However, there might be chunks in your file that don't get executed and you probably don't want to see their code. Moreover, option 1 allows skipping certain chunks simply by setting recordLabel = FALSE in their chunk options.

Related

How to Customize figure LaTeX with knitr/rmarkdown

I want to create a footer within the float for a figure created with ggplot2 in an rmarkdown document that is generating a .pdf file via LaTeX.
My question: Is there a way within rmarkdown/knitr to add more LaTeX commands within the figure environment?
Specifically, I'd like to find a way to insert custom LaTeX using either the floatrow or caption* macro as described in https://tex.stackexchange.com/questions/56529/note-below-figure within the figure environment.
When I looked at the chunk options (https://yihui.org/knitr/options/#plots), something like out.extra seems close to what I want, but that is used as an extra option to \includegraphics while I want access to put extra LaTeX within the figure environment, outside of any other LaTeX command.
The solution to your question is perhaps quite similar to this one. However, I believe yours is a bit more general, so I'll try to be a bit more general as well...
As far as I know, there's no simple solution to add extra LaTeX code within the figure environment. What you can do is update the knit (or output) hook (i.e. the LaTeX code output generated by the figure chunk).
The source code for the LaTeX figure output hook can be found here (hook_plot_tex). The output generated can be found starting at line 159. Here we can see how the output is structured and we're able to modify it before it reaches the latex engine.
However, we only want to modify it for relevant figure chunks, not all. This is where Martin Schmelzer's answer comes in handy. We can create a new chunk option which allows for control over when it is activated. As an example enabling the use of caption* and floatrow we can define the following knit hook
defOut <- knitr::knit_hooks$get("plot")
knitr::knit_hooks$set(plot = function(x, options) {
  # reuse the default knit_hook which will be updated further down
  x <- defOut(x, options)
  # make sure the modifications only take place when we enable the customplot option
  if (!is.null(options$customplot)) {
    x <- gsub("caption", "caption*", x) #(1)
    inter <- sprintf("\\floatfoot{%s}\\end{figure}", options$customplot[1]) #(2)
    x <- gsub("\\end{figure}", inter, x, fixed = TRUE) #(3)
  }
  return(x)
})
What we're doing here is (1) replacing the \caption command with \caption*, (2) defining the custom floatfoot text input, (3) replacing \end{figure} with \floatfoot{custom text here}\end{figure} such that floatfoot is inside the figure environment.
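The three steps can be tried on a plain string without knitting anything. This is a minimal sketch: the LaTeX string below is a hand-written stand-in, not actual knitr output, and the footer text is invented. Unlike the hook above, it matches `\caption{` literally, an assumption made here so that the word "caption" inside the caption text itself is left alone.

```r
# Hand-written stand-in for the LaTeX that a figure chunk might produce.
x <- "\\begin{figure}\n\\includegraphics{plot-1}\n\\caption{My caption}\n\\end{figure}"
footer <- "Source: hypothetical data"

# (1) switch to the starred caption; fixed = TRUE treats the pattern literally
x <- gsub("\\caption{", "\\caption*{", x, fixed = TRUE)
# (2) build the footer followed by the closing of the environment
inter <- sprintf("\\floatfoot{%s}\\end{figure}", footer)
# (3) swap the plain \end{figure} for footer + \end{figure}
x <- gsub("\\end{figure}", inter, x, fixed = TRUE)

cat(x, "\n")
```

With `fixed = TRUE`, neither the pattern nor the replacement is interpreted as a regular expression, which is why the backslashes and braces need no extra escaping beyond R's own string syntax.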
As you can probably tell, the sky's the limit for what you can add/replace in the figure environment. Just make sure it is added inside the environment and is in the appropriate location. See the example below for how the chunk option is used to enable floatfoot and caption*. (You can also split the customplot option into e.g. starcaption and floatfoot by simply dividing up the !is.null(options$customplot) condition. This should allow for better control.)
Working example:
---
header-includes:
- \usepackage[capposition=top]{floatrow}
- \usepackage{caption}
output: pdf_document
---
```{r, echo = F}
library(ggplot2)
defOut <- knitr::knit_hooks$get("plot")
knitr::knit_hooks$set(plot = function(x, options) {
x <- defOut(x, options)
if(!is.null(options$customplot)) {
x <- gsub("caption", "caption*", x)
inter <- sprintf("\\floatfoot{%s}\\end{figure}", options$customplot[1])
x <- gsub("\\end{figure}", inter, x, fixed=T)
}
return(x)
})
```
```{r echo = F, fig.cap = "Custom LaTeX hook chunk figure", fig.align="center", customplot = list("This is float text using floatfoot and floatrow")}
ggplot(data = iris, aes(x=Sepal.Length, y=Sepal.Width))+
geom_point()
```
PS
The example above requires the fig.align option to be enabled. Should be fairly easy to fix, but I didn't have the time to look into it.
@henrik_ibsen gave the answer that got me here. I made some modifications to the code that I ended up using to make it work a bit more simply:
hook_plot_tex_footer <- function(x, options) {
x_out <- knitr:::hook_plot_tex(x, options)
if(!is.null(options$fig.footer)) {
inter <- sprintf("\\floatfoot{%s}\n\\end{figure}", options$fig.footer[1])
x_out <- gsub(x=x_out, pattern="\n\\end{figure}", replacement=inter, fixed=TRUE)
}
x_out
}
knitr::knit_hooks$set(plot=hook_plot_tex_footer)
Overall, this does the same thing. But it uses knitr:::hook_plot_tex() instead of defOut(), so that it still works if rerun in the same session. And since the text goes into a \floatfoot specifically, I named the option fig.footer. Those changes are minor, and the credit for the answer definitely goes to @henrik_ibsen.

Using R/Markdown fails inside learnr question

Motivation: I want to write an interface that uses questions from the R package exams in learnr questions/quizzes. In R/exams each question is either an R/Markdown (Rmd) or R/LaTeX (Rnw) file with a certain structure specifying question, solution, and further meta-information. The questions can contain R code to make them dynamic, e.g., sampling numbers or certain text building blocks etc. Hence, the workflow is that first the questions are run through knitr::knit or utils::Sweave and then embedded in a suitable output format.
Problem: When I rmarkdown::run("learnr+rexams.Rmd") a learnr tutorial that dynamically produces a question or quiz from an Rmd exercise I get the error:
Error in if (grepl(not_valid_char_regex, label)) { :
argument is of length zero
The code for a simple reproducible example learnr+rexams.Rmd is included below.
The reason for the error appears to be that learnr runs a function verify_tutorial_chunk_label() that tries to ensure that the learnr R chunk labels are well formatted. However, confusion is caused by the chunks that are run by the R/exams package, unnecessarily leading to the error above.
Workarounds: I can disable the verify_tutorial_chunk_label() in the learnr namespace and then everything works well. Or I can use Rnw instead of Rmd exercises and then learnr does not conflict with Sweave(). Also, when I run my code outside of a learnr tutorial it works fine.
Question: Can I do anything less invasive to make exams cooperate with learnr? For example, setting some appropriate knitr options or something like that?
Example: This is the source for the minimal learnr tutorial learnr+rexams.Rmd that replicates the problem. Note that everything is very much simplified and only works for certain R/exams exercises, here using the function exercise template that ships with R/exams.
---
title: "learnr & R/exams"
output: learnr::tutorial
runtime: shiny_prerendered
---
```{r exams2learnr, include = FALSE}
exams2learnr <- function(file) {
x <- exams::xexams(file)[[1]][[1]]
x <- list(text = x$question, type = "learnr_text",
learnr::answer(x$metainfo$solution, correct = TRUE))
do.call(learnr::question, x)
}
## assignInNamespace("verify_tutorial_chunk_label", function() return(), ns = "learnr")
```
```{r rfunctions, echo = FALSE, message = FALSE}
exams2learnr("function.Rmd")
```
Running this tutorial (as noted above) replicates the error. To avoid it I can either uncomment the assignInNamespace() call or alternatively replace "function.Rmd" by "function.Rnw".
The problem is that by the time learnr::question() is called, knitr is no longer able to find the chunk label for the chunk where exams2learnr() was called. You can get around this by setting the current chunk label before calling do.call(learnr::question, x):
exams2learnr <- function(file, label = knitr::opts_current$get("label")) {
  force(label)
  x <- exams::xexams(file)[[1]][[1]]
  x <- list(
    text = x$question,
    type = "learnr_text",
    learnr::answer(x$metainfo$solution, correct = TRUE)
  )
  knitr::opts_current$set(label = label)
  do.call(learnr::question, x)
}
This also lets you set the label dynamically if you want, which becomes the ID of the question in learnr.
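The `force(label)` call matters because R evaluates default arguments lazily. The sketch below illustrates this with a hypothetical stand-in for `knitr::opts_current` (the `current` environment and both functions are invented for the demonstration); it assumes, as the answer suggests, that the label is gone by the time it is needed.

```r
# Hypothetical stand-in for knitr::opts_current: a tiny mutable store.
current <- new.env()

lazy <- function(label = current$label) {
  current$label <- NULL  # imitates the label being lost during nested knitting
  label                  # the default is evaluated only here, after the loss
}

eager <- function(label = current$label) {
  force(label)           # evaluate the default now, while the label still exists
  current$label <- NULL
  label
}

current$label <- "rfunctions"
lazy_result <- lazy()    # NULL: the label was read too late

current$label <- "rfunctions"
eager_result <- eager()  # "rfunctions": captured before it disappeared
```

In the real function the captured value is then written back with `knitr::opts_current$set(label = label)` so that learnr can find it.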

kable all tables in markdown report [duplicate]

Using knitr and R Markdown, I can produce a tabularised output from a matrix using the following command:
```{r results='asis'}
kable(head(x))
```
However, I’m searching for a way to make the kable code implicit since I don’t want to clutter the echoed code with it. Essentially, I want this:
```{r table=TRUE}
head(x)
```
… to yield a formatted tabular (rather than the normal output='markdown') output.
I actually thought this must be pretty straightforward since it’s a pretty obvious requirement, but I cannot find any way to achieve this, either via the documentation or on the web.
My approach to create an output hook fails because once the data arrives at the hook, it’s already formatted and no longer the raw data. Even when specifying results='asis', the hook obtains the output as a character string and not as a matrix. Here’s what I’ve tried:
default_output_hook <- knit_hooks$get('output')
knit_hooks$set(output = function (x, options)
if (! is.null(options$table))
kable(x)
else
default_output_hook(x, options)
)
But like I said, this fails since x is not the original matrix but rather a character string, and it doesn’t matter which value for the results option I specify.
Nowadays one can set df_print in the YAML header:
---
output:
html_document:
df_print: kable
---
```{r}
head(iris)
```
I think other answers are from a time when the following didn't work, but now we can just do :
```{r results='asis', render=pander::pander}
head(x)
```
Or set this for all chunks in the setup chunk, for instance :
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, render=pander::pander)
```
Lacking a better solution I’m currently re-parsing the character string representation that I receive in the hook. I’m posting it here since it kind of works. However, parsing a data frame’s string representation is never perfect. I haven’t tried the following with anything but my own data and I fully expect it to break on some common use-cases.
reparse <- function (data, comment, ...) {
    # Remove leading comments
    data <- gsub(sprintf('(^|\n)%s ', comment), '\\1', data)
    # Read into data frame
    read.table(text = data, header = TRUE, ...)
}
default_output_hook <- knit_hooks$get('output')
knit_hooks$set(output = function (x, options)
    if (is.null(options$table))
        default_output_hook(x, options)
    else {
        extra_opts <- if (is.list(options$table)) options$table else list()
        paste(kable(do.call(reparse, c(x, options$comment, extra_opts))),
              collapse = '\n')
    }
)
This will break if the R markdown comment option is set to a character sequence containing a regular expression special char (e.g. *), because R doesn’t seem to have an obvious means of escaping a regular expression.
Here’s a usage example:
```{r table=TRUE}
data.frame(A=1:3, B=4:6)
```
You can pass extra arguments to the reparse function. This is necessary e.g. when the table contains NA values because read.table by default interprets them as strings:
```{r table=list(colClasses=c('numeric', 'numeric'))}
data.frame(A=c(1, 2, NA, 3), B=c(4:6, NA))
```
Far from perfect, but at least it works (for many cases).
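The regular-expression caveat mentioned above can be side-stepped by not using a regular expression for the prefix at all. The following is only a sketch of that idea, not the code used in the hook: it strips the comment prefix with `startsWith()` and `substring()`, so a comment string such as `**` is treated literally. The sample output string is hand-written to resemble a printed data frame.

```r
# Variant of reparse() that removes the comment prefix without a regex.
reparse <- function(data, comment, ...) {
  prefix <- paste0(comment, " ")
  lines <- strsplit(data, "\n", fixed = TRUE)[[1]]
  has_prefix <- startsWith(lines, prefix)
  # Drop the literal prefix from every line that carries it.
  lines[has_prefix] <- substring(lines[has_prefix], nchar(prefix) + 1)
  read.table(text = lines, header = TRUE, ...)
}

# Hand-written stand-in for chunk output with the comment option set to "**".
out <- "**   A B\n** 1 1 4\n** 2 2 5\n** 3 3 6"
df <- reparse(out, "**")
print(df)
```

The other limitations of re-parsing printed output still apply; only the handling of special characters in the comment string changes.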
Not exactly what you are looking for, but I am posting an answer here (that could not fit in a comment) as your described workflow is really similar to what my initial goal and use-case was when I started to work on my pander package. Although I really like the bunch of chunk options that are available in knitr, I wanted to have an engine that makes creating documents really easy, automatic and without any needed tweaks. I am aware of the fact that knitr hooks are really powerful, but I just wanted to set a few things in my Rprofile and let the literate programming tool do its job without further trouble; that ended up being Pandoc.brew for me.
The main idea is to specify a few options (what markdown flavour are you using, what's your decimal mark, favorite colors for your charts etc), then simply write your report in a brew syntax without any chunk options, and the results of your code would be automatically transformed to markdown. Then convert that to pdf/docx/odt etc. with Pandoc.

Include same chunk twice with different parameters

I have a long .Rnw document which consists mostly of text (typeset in LaTeX) with a few chunks here and there. I have also written a chunk which outputs a specific figure. The figure contains a plot, the values for the plot are currently read from a .csv file and some parameters like colors defined manually within the chunk.
Now I want to have the same figure in a different place in the document, but with different values for the plot and a few other parameters different. Ideally, I would like to include the chunk as a child twice, and pass parameters to it somehow, including the name of the .csv to be used for the plot values. I would hate to copy paste the chunk code with hardcoded parameters, as it is complex enough that potential changes will be difficult to synchronize.
How can I do such "parameterized reuse" of chunks?
update
As requested, a small example
This is saved as include-chunk-reuse.Rnw
<<toReuse, echo=FALSE, results='asis'>>=
l <- 25
@
\newlength{\mylength}
\setlength{\mylength}{\Sexpr{l}pt}
%Omitted: a lot of complicated LaTeX commands
\rule{\mylength}{1pt}
This is the document which is supposed to reuse the chunk. It doesn't even compile, as it complains that the same label is used twice: Error in parse_block(g[-1], g[1], params.src) : duplicate label 'toReuse'
\documentclass{article}
\begin{document}
This is some text. And now comes a 25 pt wide line.
<<first-figure, child='include-chunk-reuse.Rnw'>>=
@
This is some text. The next line is also 25 pt wide. But I would like to call the chunk in a way which makes it 50 pt wide instead.
<<second-figure, child='include-chunk-reuse.Rnw'>>=
@
\end{document}
To make the knitr part work, simply leave out the chunk name in the child document; without the duplicated label, knitr no longer complains.
Passing parameters does not really work as far as I know, but you can just set a global variable before including the child (for example, \Sexpr{l <- 200}).
You are still redefining \mylength which is why LaTeX will throw an error, so move the first definition of \mylength from the child to the main document.
The example below demonstrates two ways to reuse and parametrize a chunk.
Reusing Chunks
The mechanism is explained here. Basically, the simplest way to reuse a chunk is to add another empty chunk with the same label. Alternatively, the chunk option ref.label lets a chunk inherit another chunk's code.
Both approaches of reusing chunks are largely equivalent – with one exception: figures generated in chunks are saved as chunklabel-i.pdf, where i is the figure index counted by chunk. Therefore, if a chunk is reused by repeating its label, figure i from the second use will overwrite figure i from the first use. This is the reason why I use ref.label (and thus distinct chunk labels) in the example below (otherwise, the points on both plots would be green).
In the example below, I used eval = FALSE in order to prevent evaluation of the masterchunk where it is defined. An alternative would be to externalize the chunk and read it by read_chunk().
Parameterizing Chunks
The two most straightforward options to "pass" parameters to a chunk are chunk options and global variables.
Also when reusing chunks, each use can set different chunk options. The example below exploits this to set different captions.
As all chunks run in the same environment, setting a variable in an early chunk affects subsequent chunks accessing this variable. In the example below, mycolor is modified this way.
\documentclass{article}
\begin{document}
<<masterchunk, eval = FALSE>>=
plot(1:10, col = mycolor)
@
<<config1>>=
mycolor <- "red"
@
<<use1, ref.label = "masterchunk", fig.cap = "Red dots">>=
@
<<config2>>=
mycolor <- "green"
@
<<use2, ref.label = "masterchunk", fig.cap = "Green dots">>=
@
\end{document}

