ESS & Knitr/Sweave: How to source the Rnw file into an interactive session?

This is a terribly simple request, and I can't believe I haven't found the solution to this yet, but I've been searching for it far and wide without luck.
I have an .Rnw file loaded up in Emacs, I use M-n s to compile it.
Everything works well, and it even opens an R buffer. Great. But that buffer
is entirely useless: it doesn't contain the objects that I just sourced!
Example minimal .Rnw file:
\documentclass{article}
\begin{document}
<<>>=
foo <- "bar"
@
\end{document}
Using M-n s, I now have a new R-buffer with a session loaded up, but:
> foo
Error: object 'foo' not found
That is disappointing. I would like to play around with the data interactively.
How do I achieve that? I don't want to be sourcing the file line-by-line, or
region-by-region with C-c C-c or something similar every time I change my code.
Ideally, it should be just like RStudio's source function, that leaves me with
a fully prepared R session.
I haven't tried this with sweave yet, only with knitr.
EDIT: the eval=TRUE chunk option does not seem to result in the correct behaviour.

This behaviour was recently changed in ESS. Now Sweave and knitr are executed directly in the global environment, as if you had typed the code yourself at the command line. So wait a couple more weeks until ESS v13.09 is out, or use the development version.
Alternatively, you can set ess-swv-processing-command to "%s(%s)" and you will get the same result, except for the automatic library loading.
For the record, knitr (in contrast to Sweave) evaluates everything in its own environment unless you instruct it otherwise.
[edit: Something went wrong. I don't see the correct .ess_weave any more. Probably some git commit mess-up again. So it is not fixed in 13.09. Fixing it now. Sorry.]

Open an interactive R session and then call Sweave directly; I believe it works like this (untested, though). knitr works the same way, though you need to load the knitr package first.
> Sweave("yourfile.Rnw")
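For knitr, the equivalent would be along these lines (also untested here; the file name is just a placeholder):
> library(knitr)
> knit("yourfile.Rnw")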
There is some potential for peril here, though. If you call Sweave in a session after doing other things, your code can use things previously in the workspace, thus making your results unreproducible.

Related

Difference: "Compile PDF" button in RStudio vs. knit() and knit2pdf()

TL;DR
What are the (possibly unwanted) side-effects of using knit()/knit2pdf() instead of the "Compile PDF"1 button in RStudio?
Motivation
Most users of knitr seem to write their documents in RStudio and compile them using the "Compile PDF" / "Knit HTML" button. This works smoothly most of the time, but every once in a while there are special requirements that cannot be achieved using the compile button. In these cases, the solution is usually to call knit()/knit2pdf()/rmarkdown::render() (or similar functions) directly.
Some examples:
How to knit/Sweave to a different file name?
Is there a way to knitr markdown straight out of your workspace using RStudio?
Insert date in filename while knitting document using RStudio Knit button
Using knit2pdf() instead of the "Compile PDF" button usually offers a simple solution to such questions. However, this comes at a price: There is the fundamental difference that "Compile PDF" processes the document in a separate process and environment whereas knit2pdf() and friends don't.
This has implications, and the problem is that not all of them are obvious. Take the fact that knit() uses objects from the global environment (whereas "Compile PDF" does not) as an example. This might be obvious and the desired behavior in cases like the second example above, but it is an unexpected consequence when knit() is used to overcome problems like those in examples 1 and 3.
Moreover, there are more subtle differences:
The working directory might not be set as expected.
Packages need to be loaded.
Some options that are usually set by RStudio may have unexpected values.
The Question and its goal
Whenever I read/write the advice to use knit2pdf() instead of "Compile PDF", I think "correct, but the user should understand the consequences …".
Therefore, the question here is:
What are the (possibly unwanted) side-effects of using knit()/knit2pdf() instead of the "Compile PDF" button in RStudio?
If there was a comprehensive (community wiki?) answer to this question, it could be linked in future answers that suggest using knit2pdf().
Related Questions
There are dozens of questions related to this one. However, they either propose only code to (more or less) reproduce the behavior of the RStudio button, or they explain what "basically" happens without mentioning the possible pitfalls. Others look like very similar questions but turn out to be (very) special cases of it. Some examples:
Knit2html not replicating functionality of Knit HTML button in R Studio: Caching issue.
HTML outputs are different between using knitr in Rstudio & knit2html in command line: Markdown versions.
How to convert R Markdown to HTML? I.e., What does “Knit HTML” do in Rstudio 0.96?: Rather superficial answer by Yihui (explains what "basically" happens) and some options for how to reproduce the behavior of the RStudio button. Neither the suggested Sys.sleep(30) nor the "Compile PDF" log are insightful (both hints point to the same thing).
What does “Knit HTML” do in Rstudio 0.98?: Reproduce behavior of button.
About the answer
I think this question raised many of the issues that should be part of an answer. However, there might be many more aspects I don't know about, which is why I am reluctant to self-answer this question (though I might try if nobody answers).
Probably, an answer should cover three main points:
The new session vs. current session issue (global options, working directory, loaded packages, …).
A consequence of the first point: The fact that knit() uses objects from the calling environment (default: envir = parent.frame()) and implications for reproducibility. I tried to tackle the issue of preventing knit() from using objects from outside the document in this answer (second bullet point).
Things RStudio secretly does …
… when starting an interactive session (example) --> Not available when hitting "Compile PDF"
… when hitting "Compile PDF" (anything special besides the new session with the working directory set to the file processed?)
I am not sure about the right perspective on the issue. I think both "What happens when I hit 'Compile PDF', and its implications" and "What happens when I use knit(), and its implications" are good approaches to tackle the question.
1 The same applies to the "Knit HTML" button when writing RMD documents.
First of all, I think this question is easier to answer if you limit the scope to the "Compile PDF" button, because the "Knit HTML" button is a different story. "Compile PDF" is only for Rnw documents (R + LaTeX, or think Sweave).
I'll answer your question following the three points you suggested:
Currently RStudio always launches a new R session to compile Rnw documents, and first changes the working directory to the directory of the Rnw file. You can imagine the process as a shell script like this:
cd path/to/your-Rnw-directory
Rscript -e "library(knitr); knit('your.Rnw')"
pdflatex your.tex
Note that the knitr package is always attached, and pdflatex might be other LaTeX engines (depending on your RStudio configurations for Sweave documents, e.g., xelatex). If you want to replicate it in your current R session, you may rewrite the script in R:
owd = setwd("path/to/your-Rnw-directory")
system2("Rscript", c("-e", shQuote("library(knitr); knit('your.Rnw')"))
system2("pdflatex", "your.tex")
setwd(owd)
which is not as simple as knitr::knit('path/to/your.Rnw'), in which case the working directory is not automatically changed, and everything is executed in the current R session (in the globalenv() by default).
Because the Rnw document is always compiled in a new R session, it won't use any objects in your current R session. This is hard to replicate only through the envir argument of knitr::knit() in the current R session. In particular, you cannot use knitr::knit(envir = new.env()) because although new.env() is a new environment, it has a default parent environment parent.frame(), which is typically the globalenv(); you cannot use knitr::knit(envir = emptyenv()), either, because it is "too clean", and you will have trouble with objects even in the R base package. The only reliable way to replicate what the "Compile PDF" button does is what I said in 1: system2("Rscript", c("-e", shQuote("library(knitr); knit('your.Rnw')")), in which case knit() uses the globalenv() of a new R session.
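A small illustration of the new.env() point (the object name here is made up): because the default parent of new.env() is the calling environment, objects in the global environment remain visible through the parent chain.
x <- "defined in the global environment"
e <- new.env()            # parent defaults to the calling environment
exists("x", envir = e)    # TRUE: lookup falls through to globalenv()
# so knitr::knit("your.Rnw", envir = e) would still see x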
I'm not entirely sure about what RStudio does for the repos option. It probably automatically sets this option behind the scenes if it is not set. I think this is a relatively minor issue. You can set it in your .Rprofile, and I think RStudio should respect your CRAN mirror setting.
Users have always been asking why Rnw documents (or R Markdown documents) are not compiled in the current R session. To us, it basically boils down to which of the following consequences is more surprising or undesired:
If we knit a document in the current R session, there is no guarantee that your results can be reproduced in another R session (e.g., the next time you open RStudio, or your collaborators open RStudio on their computers).
If we knit a document in a new R session, users can be surprised that objects are not found (even though, when they type the object names in the R console, they can see them). This can be surprising, but it is also a good and early reminder that your document probably won't work the next time.
To sum it up, I think:
Knitting in a new R session is better for reproducibility;
Knitting in the current R session is sometimes more convenient (e.g., you try to knit with different temporary R objects in the current session). Sometimes you also have to knit in the current R session, especially when you are generating PDF reports programmatically, e.g., you use a (for) loop to generate a series of reports. There is no way that you can achieve this only through the "Compile PDF" button (the button is mostly only for a single Rnw document).
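As a rough sketch of the programmatic case just mentioned (the file and object names are invented), such a loop might look like this:
library(knitr)
for (year in 2020:2022) {
  report_year <- year   # referenced by chunks inside report.Rnw
  knit2pdf("report.Rnw", output = sprintf("report-%d.tex", year))
}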
BTW, I think what I said above can also apply to the Knit or Knit HTML buttons, but the underlying function is rmarkdown::render() instead of knitr::knit().

How to use objects from global environment in Rstudio Markdown

I've seen similar questions on Stack Overflow but virtually no conclusive answers, and certainly no answer that worked for me.
What is the easiest way to access and use objects (regression fits, data frames, other objects) that are located in the global R environment in an R Markdown (RStudio) script?
I find it surprising that there is no easy solution to this, given the tendency of the RStudio team to make things comfortable and effective.
Thanks in advance.
For better or worse, this omission is intentional. Relying on objects created outside the document makes your document less reproducible--that is, if your document needs data in the global environment, you can't just give someone (or yourself in two years) the document and data files and let them recreate it themselves.
For this reason, and in order to perform the render in the background, RStudio actually creates a separate R session to render the document. That background R session cannot see any of the environments in the interactive R session you see in RStudio.
The best way around this problem is to take the code you used to create the contents of your global environment and move it inside your document (you can use echo = FALSE if you don't want it to show up in the document). This makes your document self-contained and reproducible.
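For example, a setup chunk along these lines (the object and file names are only illustrative) recreates the needed objects inside the document instead of relying on the interactive session:
```{r prepare-data, echo=FALSE}
# rebuild the objects the document needs, rather than reading them from the global environment
my_data <- read.csv("my_data.csv")
fit <- lm(y ~ x, data = my_data)
```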
If you can't do that, there are a few approaches you can take to use the data in the global environment directly:
Instead of using the Knit HTML button, type rmarkdown::render("your_doc.Rmd") at the R console. This will knit in the current session instead of a background session. Alternatively:
Save your global environment to an .Rdata file prior to rendering (use R's save function), and load it in your document.
Well, in my case I found the following solution:
(1) Save your global environment in an .RData file inside the same folder where you have your .Rmd file. (You just need to click the diskette icon in the "Global Environment" panel.)
(2) Write the following code in your R Markdown script:
load(file = "filename.RData") # loads the file that you saved before
and stop suffering.
Going to RStudio's 'Tools' > 'Global Options' and visiting the 'R Markdown' tab, you can make a selection under 'Evaluate chunks in directory'; select the option 'Documents' there, and the R Markdown knitting engine will access the global environment just as plain R code does. Hope this helps those who search for this info!
The thread is old but in case anyone's still looking for a solution (as I was):
You can pass an envir argument to render() (or knit()) so that it can access objects from the environment it was called from.
rmarkdown::render(
  input = input_rmd,
  output_file = output_file,
  envir = parent.frame()
)
I have the same problem myself. Some stuff is pretty time-consuming to reproduce every time.
I think there could be another answer: what if you save your environment with the save.image() function to a different file than the standard .RData one, then bring it back with load()?
To be sure you are using the same data, use md5sum() from the tools package.
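A minimal sketch of that idea (the file name is arbitrary): save the image and note its checksum in the interactive session, then load it inside the document.
save.image(file = "analysis-snapshot.RData")  # run in the interactive session
tools::md5sum("analysis-snapshot.RData")      # record the checksum to verify later

# later, inside a chunk of the R Markdown document:
load("analysis-snapshot.RData")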
Cheers, Cord
I think I solved this problem by referring to the package explicitly in the code that is being knitted. Using the yarrr package, for example, I loaded the data frame "pirates" using data(pirates). This worked fine at the console and within an RStudio code chunk, but with knitr it failed, following the pattern in the question above. If, however, I loaded the data into memory by creating an object using pirates <- yarrr::pirates, the document then knitted cleanly to HTML.
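In code, the workaround described here amounts to something like this (it assumes the yarrr package is installed):
# data(pirates) alone was not enough when knitting; an explicit assignment was
pirates <- yarrr::pirates
head(pirates)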
You can load the script in the desired environment as follows:
```{r, include=FALSE}
source("your-script.R", local = knitr::knit_global())
# or sys.source("your-script.R", envir = knitr::knit_global())
```
Then, in the R Markdown document, you can use the objects created in these scripts (e.g., data objects or functions).
https://bookdown.org/yihui/rmarkdown-cookbook/source-script.html
One option that I have not yet seen is the use of parameters.
This chapter goes through a simple example of how to do this.
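As a rough sketch of the parameter approach (the file, parameter, and object names are made up): the .Rmd declares a parameter in its YAML header (e.g. params: data: NULL), refers to params$data inside its chunks, and the caller passes the object at render time:
rmarkdown::render("report.Rmd", params = list(data = my_data_frame))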

Is it possible to compile R LaTeX via knitr in a modular way?

Is there any way to compile knitr subfiles separately? What I have in mind is something like the subfiles package for LaTeX, just in combination with R/knitr/Sweave.
This would be great when, for example, there are two exercises, the first involving heavy computations, and one doesn't want to recompile the entire first exercise every time while working on and testing the second one.
The patchDVI package does this for Sweave. I imagine it would be possible (maybe even easy) to modify it to do the same for knitr.
For example, in Sweave, you define variables in a chunk like so:
<<>>=
.TexRoot <- "main.tex"
.SweaveFiles <- c("subfile1.Rnw", "subfile2.Rnw")
@
and after Sweave is finished running that file, patchDVI will check whether the files subfile1.Rnw and subfile2.Rnw also need to be run, then will run LaTeX on the main.tex file once everything is up to date.
You don't need to do anything difficult; just use the cache options. There are lots of details here, but it's probably as simple as specifying cache = TRUE in the chunk options of your first exercise.
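For instance, a chunk header along these lines (the label is arbitrary) caches the heavy first exercise so that it is only re-run when its code changes:
<<exercise1, cache=TRUE>>=
fit <- lm(mpg ~ ., data = mtcars)   # stands in for the slow computation; cached by knitr
@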

Can knit2pdf use the global environment?

In this answer, @Yihui said that knitr uses the global environment. This confused me--my experience had been that it doesn't. I never really use knit, though; I usually go straight to PDF.
In a little experiment, it seems that knit does use the global environment (or whatever environment you specify using the envir argument), but that knit2pdf doesn't.
Minimal example: global_test.Rnw file
\documentclass{article}
\begin{document}
<<>>=
print(x)
@
\end{document}
R Script:
x <- "Hello World"
knit(input="global_test.Rnw")
# Works as expected, could now call tools::texi2pdf to generate pdf.
knit2pdf(input="global_test.Rnw")
# Doesn't
The latter generates a PDF file that won't display and gives a warning:
running command '"C:\PROGRA~2\MIKTEX~1.9\miktex\bin\texi2dvi.exe" --quiet --pdf
"global.pdf" -I "C:/PROGRA~1/R/R-215~1.3/share/texmf/tex/latex" -I
"C:/PROGRA~1/R/R-215~1.3/share/texmf/bibtex/bst"' had status 1
I tried passing an environment to knit2pdf (envir = globalenv()), hoping it would be ... passed on, but I just get an unused argument error.
Generally, I know that referencing the global environment is poor form, but is there a way to do it with knit2pdf, or to pass an environment explicitly, or am I better off using brew and sprintf as in @Ramnath's answer to the same question above?
In my use case, I don't think tools::texi2pdf is useful because I need to compile with XeLaTeX, which knit2pdf handles effortlessly.
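For reference, the XeLaTeX route mentioned here can be spelled out roughly as follows (the file name is taken from the example above):
library(knitr)
knit2pdf("global_test.Rnw", compiler = "xelatex")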
The problems with the examples in the question seem to have nothing to do with environments. Everything will compile correctly and without warnings if the output argument is left off of knit2pdf.
For reference, I was using knitr 1.1 on R 2.15.3 on Windows 7. I'll let Yihui know as it appears to be a bug in knit2pdf (which calls tools::texi2pdf, which doesn't accept an output file path).
UPDATE: The problem has been fixed in the development version of knitr, available here.
It's also worth noting that the Compile PDF button in RStudio does not use your current environment, so if you want to have access to global variables and you are using RStudio, make an explicit call to the appropriate knit function rather than relying on the shortcut. In fact, it doesn't use knit2pdf directly, rather it calls rmarkdown::render.

Sweave syntax highlighting in output

Has anyone managed to get color syntax-highlighting working in the output of Sweave documents? I've been able to customize the output style by adding boxes, etc. in the Sweave.sty file as follows:
\DefineVerbatimEnvironment{Sinput}{Verbatim}{fontseries=bc,frame=single}
\DefineVerbatimEnvironment{Soutput}{Verbatim}{frame=leftline}
\DefineVerbatimEnvironment{Scode}{Verbatim}{fontseries=bc}
And I can get the minted package to do syntax highlighting of verbatim-code blocks in my document like so:
\begin{minted}{perl}
use Foo::Bar;
...
\end{minted}
but I'm not sure how to combine the two for R input sections. I tried the following:
\DefineVerbatimEnvironment{Sinput}{minted}{r}
\DefineVerbatimEnvironment{Scode}{minted}{r}
Any suggestions?
Yes, look at some of the vignettes for Rcpp, for example (to pick just one) the Rcpp-FAQ PDF.
We use the highlight package by Romain, which itself can farm out to the highlight binary by Andre Simon. It makes everything a little more involved (Makefiles for the vignettes, etc.), but we get colourful output from R and C/C++ code, which makes it worth it.
I have a solution that has worked for me; I have not tried it on any other systems, though, so things may not work out of the box for you. I've posted some code at https://gist.github.com/797478 that is a set of modified Rweave driver functions that use minted blocks instead of verbatim blocks.
To use this driver, just specify it when calling the Sweave function with the driver = RweaveLatexMinted() option.
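Assuming the driver functions from that gist have been sourced into the session (the file name below is a placeholder), the call would look roughly like this:
source("RweaveLatexMinted.R")   # the modified driver functions from the gist
Sweave("yourfile.Rnw", driver = RweaveLatexMinted())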
Here's how I've ended up solving it, starting from @daroczig's suggestion.
\usepackage{minted}
\renewenvironment{Sinput}{\minted[frame=single]{r}}{\endminted}
\DefineVerbatimEnvironment{Soutput}{Verbatim}{frame=leftline}
\DefineVerbatimEnvironment{Scode}{Verbatim}{}
While I was at it, I needed to get caching working because I'm using large data sets and one chunk was taking around 3 minutes to complete. So I wrote this zsh shell function to process an .Rnw file with caching:
function sweaveCache() {
Rscript -e "library(cacheSweave); setCacheDir(getwd()); Sweave('$1.Rnw', driver = cacheSweaveDriver)" &&
pdflatex --shell-escape $1.tex &&
open $1.pdf
}
Now I just do sweaveCache myFile and I get the result opened in Preview (on OS X).
This topic on tex.StackExchange might be interesting for you, as it suggests loading the SweaveListingUtils package in R for an easy solution.
