I am working on a data set using both my laptop and a cloud facility. I want to compute some "computation-heavy" code chunks only when working on the cloud.
So far, I have chosen a not very elegant way to do so. I have added ##OPT## prefixes to segments which I want to execute only when in the cloud. I then simply remove these prefixes and run the script when in the cloud.
Now my question: is there a way that I can select in the beginning of the script once whether to execute these segments or not and then skip these segments if the argument is set to "false"? I have tried with if conditions but this is very cumbersome.
If you use an R notebook in RStudio, you can include the different code in different code chunks in the document. Code chunks are defined like:
```{r}
"hello world!"
```
Doing this allows you to quite easily run only the chunks that you want to run. Additionally, if you wish to run all of the chunks, you can do so.
Any given chunk possess an option called eval which dictates whether or not they should be run. This can take a value from an expression so you can essentially do something like:
```{r label}
is_cloud <- FALSE #or TRUE
```
```{r conditional, eval = is_cloud}
"hello world!"
```
and the chunk will be executed only if is_cloud is TRUE.
To further explain the comment of docendo discimus, just define a parameter at the beginning of your script:
execpart <- TRUE #and change to FALSE if you don't want to execute
Then wrap the whole part of your script which should only be executed situationally in:
if(execpart){
## your script
}
You could even define multiple parameters for different parts of your script at the beginning. That would give you the option to set up the execution of your script with a few quick changes.
Note that if looks for TRUE/FALSE, so you do not need to specify (execpart == TRUE) in your if-condition.
In RStudio R notebooks is is convenient to combine the nice suggestion in another answer with using "params" in the notebook header. E.g.:
---
title: params example
output: html_notebook
params:
in_cloud: yes
---
This one always runs:
```{r}
print("hello always")
```
This one only runs when in_cloud is TRUE:
```{r, eval=params$in_cloud}
print("hello, cloud")
```
Conditional blocks also work for other engines (like bash):
```{bash, eval=params$in_cloud}
echo "hello, cloud, from bash"
```
Related
Using the chunk option eval=FALSE one can suppress chunk evaluation in an RMarkdown file or R Notebook when it is knit. Is there a way to make this apply during interactive running of the document in RStudio (i.e., making "run all chunks" skip certain chunks)?
I've got some chunks in the beginning of my analysis that take a while to run, which the later sections don't depend on. I want to be able to source the important parts of the code so I can continue writing the downstream stuff, without having to do it manually chunk-by-chunk so I can avoid the parts I don't need in my workspace for further writing.
I've set up the rmarkdown document with logical parameters meant to change which parts of the code need to get run - I meant these as control flags for when the code is actually finished and getting used, but I was hoping I could use those same parameters to exclude chunks from running in interactive mode (i.e., something like eval=params$run_part1).
Setting knitr::opts_chunk and knitr::opts_hooks only help you when knitting, not in interactive mode, so while I could be wrong I'm going to tentatively say you can't control that behavior with dynamic chunk options (yet).
As a workaround you could use interactive() and if blocks so that the code is only run when knitted. It would also mesh well with your logical parameters, despite the pain of having to be in a bracket block.
---
title: "R Notebook"
output:
html_document: default
html_notebook: default
---
```{r}
if (!interactive()) {
print("long running code")
}
```
```{r}
print(2)
```
```{r}
print(3)
```
Pressing "Run All Chunks Above":
Knitting:
I'm writing a report in R Markdown in which I don't want to print any of my R code in the main body of the report – I just want to show plots, calculate variables that I substitute into the text inline, and sometimes show a small amount of raw R output. Therefore, I write something like this:
In the following plot, we see that blah blah blah:
```{r snippetName, echo=F}
plot(df$x, df$y)
```
Now...
That's all well and good. But I would also like to provide the R code at the end of the document for anybody curious to see how it was produced. Right now I have to manually write something like this:
Here is snippet 1, and a description of what section of the report
this belongs to and how it's used:
```{r snippetName, eval=F}
```
Here is snippet 2:
```{r snippetTwoName, eval=F}
```
<!-- and so on for 20+ snippets -->
This gets rather tedious and error-prone once there are more than a few code snippets. Is there any way I could loop over the snippets and print them out automatically? I'm hoping I could do something like:
```{r snippetName, echo=F, comment="This is snippet 1:"}
# the code for this snippet
```
and somehow substitute the following result into the document at a specified point when it's knitted:
This is snippet 1:
```{r snippetName, eval=F}
```
I suppose I could write some post-processing code to scan through the .Rmd file, find all the snippets, and pull out the code with a regex or something (I seem to remember there's some kind of options file you can use to inject commands into the pandoc process?), but I'm hoping there might be something simpler.
Edit: This is definitely not a duplicate – if you read my question thoroughly, the last code block shows me doing exactly what the answer to the linked question suggests (with a slight difference in syntax, which could have been the source of the confusion?). I'm looking for a way to not have to write out that last code block manually for all 20+ snippets in the document.
This is do-able within knitr, no need to use pandoc. Based on an example posted by Yihui at https://github.com/yihui/knitr-examples/blob/master/073-code-appendix.Rnw
Set echo=FALSE throughout your document: opts_chunk$set(echo = FALSE)
Then put this chunk at the end to print all code:
```{r show-code, ref.label=all_labels(), echo = TRUE, eval=FALSE}
```
This will print code for all chunks. Currently they all show up in a single block; I'd love to figure out how to put in the chunk label or some other header... For now I start my chunks with comments (probably not a bad idea in any case).
Updated: to show only the chunks that were evaluated, use:
ref.label = all_labels(!exists('engine')) - see question 40919201
Since this is quite difficult if not impossible to do with knitr, we can take advantage of the next step, the pandoc compilation, and of pandoc's ability to manipulate content with filters. So we write a normal Rmd document with echo=TRUE and the code chunks are printed as usual when they are called.
Then, we write a filter that finds every codeblock of language R (this is how a code chunk will be coded in pandoc), removes it from the document (replacing it, here, with an empty paragraph) and storing it in a list. We then add the list of all codeblocks at the end of the document. For this last step, the problem is that there really is no way to tell a python filter to add content at the end of a document (there might be a way in haskell, but I don't know it). So we need to add a placeholder at the end of the Rmd document to tell the filter to add the R code at this point. Here, I consider that the placeholder will be a CodeBlock with code lastchunk.
Here is the filter, which we could save as postpone_chunks.py.
#!/usr/bin/env python
from pandocfilters import toJSONFilter, Str, Para, CodeBlock
chunks = []
def postpone_chunks(key, value, format, meta):
if key == 'CodeBlock':
[[ident, classes, keyvals], code] = value
if "r" in classes:
chunks.append(CodeBlock([ident, classes, keyvals], code))
return Para([Str("")])
elif code == 'lastchunk':
return chunks
if __name__ == "__main__":
toJSONFilter(postpone_chunks)
Now, we can ask knitr to execute it with pandoc_args. Note that we need to remember to add the placeholder at the end of the document.
---
title: A test
output:
html_document:
pandoc_args: ["--filter", "postpone_chunks.py"]
---
Here is a plot.
```{r}
plot(iris)
```
Here is a table.
```{r}
table(iris$Species)
```
And here are the code chunks used to make them:
lastchunk
There is probably a better way to write this in haskell, where you won't need the placeholder. One could also customize the way the code chunks are returned at the end to add a title before each one for instance.
I would like to create an Rmarkdown document (pdf or html) that has some chunks "executed" conditionally. The particular case I have in mind is that I might want a more verbose and documented version of the output for internal review by colleagues, and a shorter version for external consumers. I may not want or need to show data manipulation steps to a client, but just key graphs and tables. I also do not want to make two separate documents or have to manually indicate what to show or not.
Is there a way to set a switch at the beginning of the Rmd that indicates, e.g., verbose=T that will run all chunks or verbose=F that toggles echo=F (or include=F)?
Thank you.
knitr options can be stated as R expressions. Per the "output" documentation on the knitr webpage:
Note all options in knitr can take values from R expressions, which brings the feature of conditional evaluation introduced in the main manual. In short, eval=dothis means the real value of eval is taken from a variable named dothis in the global environment; by manipulating this variable, we can turn on/off the evaluation of a batch of chunks.
In other words if you write some chunks like:
```{r label}
doNextChunk <- as.logical(rbinom(1,1,.5))
```
```{r conditional, eval = doNextChunk}
"hello world!"
```
opts_chunk$set() is what you are after. Anything "set" will be the default for subsequent chunks (unless overwritten on a chunk-by-chunk basis)
```{r setup}
library(knitr)
opts_chunk$set(eval = TRUE, include= TRUE)
````
You can then alter as you see fit.
I am preparing a .Rmd document that begins with an executive summary. I would like to include some inline R code to present a few of the key results up front; however, those results are calculated as part of the body later in the document.
Is there any way to present a result in the rendered document out of order/sequence with the actual calculation?
You could use chunk reuse in knitr (see http://yihui.name/knitr/demo/reference/). Here you would put your chunks to analyze first but not create the output, then output the summary, then the details. Here is some quick markdown knitr code to show this:
```{r chunk1, echo=FALSE, results='hide'}
x <- rnorm(1)
x
```
the value of x is `r x`.
```{r chunk2, ref.label='chunk1', echo=TRUE, results='markup', eval=2}
```
Note that the code will be evaluated twice unless you take steps to prevent this (the eval=2 in my example).
Another option would be to create 2 child documents, the first runs your main code and creates the output, the second creates the summary. Then in your parent document you include the summary first, then the detail part. I think that you will need to run knitr on these by hand so that you do it in the correct order, the automatic child document tools would probably run in the wrong order.
I am using knit()and markdownToHTML() to automatically generate reports.
The issue is that I am not outputting plots when using these commands. However, when I use RStudio's Knit HTML button, the plots get generated. When I then use my own knit/markdown function, it suddenly outputs the plot. When I switch to another document and knit that one, the old plot appears.
Example:
```{r figA, result='asis', echo=TRUE, dpi=300, out.width="600px",
fig=TRUE, fig.align='center', fig.path="figure/"}
plot(1:10)
```
Using commands:
knit(rmd, md, quiet=TRUE)
markdownToHTML(md, html, stylesheet=style)
So I guess there are 2 questions, depending on how you want to approach it:
What magic is going on in Rstudio's Knit HTML?
How can I produce/include without depending on RStudio's Knit HTML button?
The only issue I see here is that this doesn't work when you have the chunk options {...} spanning two lines. If it's all on one line, it works fine. Am I missing something?
See how this is not allowed under knitr in the documentation:
Chunk options must be written in one line; no line breaks are allowed inside chunk options;
RStudio must handle linebreaks in a non-standard way.
This is really embarrassing, I really thought I read the documentation carefully:
include: (TRUE; logical) whether to include the chunk output in the
final output document; if include=FALSE, nothing will be written into
the output document, but the code is still evaluated and plot files
are generated if there are any plots in the chunk, so you can manually
insert figures; note this is the only chunk option that is not cached,
i.e., changing it will not invalidate the cache
Simply adding {..., include=TRUE} did the trick. I would say it would be a pretty sensible default though.