RMarkdown - running the current chunk pollutes the global environment - r

Frequently, in RStudio, in a markdown file, I find myself doing command-shift-enter to run the current chunk. This pollutes the global R environment.
Is there any way I can create a 'current environment' or 'live environment' so that anything that gets run in the console gets attached to that environment and not to the global one?

I think the answer is no, but I don't see this as a problem. You should be starting with a blank global environment every time (make sure "Save workspace to .RData on exit" is set to "Never" in the general global options).
It's a bad idea to rely on keeping variables in the global environment between sessions, because things gradually build up there, and you end up with non-reproducible results.
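For what it's worth, base R can evaluate code in an environment other than the global one, although this does not change where RStudio's run-chunk shortcut sends code. A minimal sketch, assuming a hypothetical environment named `scratch`:

```r
# `scratch` is an illustrative name, not something RStudio provides.
scratch <- new.env()

# Evaluate a block of code inside `scratch` instead of the global environment.
evalq({
  x <- 1:10
  total <- sum(x)
}, envir = scratch)

ls(scratch)                      # the objects created by the block live here
get("total", envir = scratch)

# local() is a shorthand for one-off blocks: only the final value is kept.
result <- local({
  tmp <- rnorm(100)
  mean(tmp)
})
```

This is opt-in per block, so it helps with tidy scripts but not with habitual chunk running.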

Related

How do I automatically have some code run every time I open RStudio?

So for example, I want to set my global options as such:
options(stringsAsFactors = FALSE)
Sys.setenv(JAVA_HOME="C:/Program Files/Java/jre1.8.0_171")
for every RStudio session.
How can I write my code so that they are run at the beginning of each RStudio session?
Options
You can add the options script to your .Rprofile.
One of the easiest ways to access this is through the usethis library, specifically:
usethis::edit_r_profile()
The .Rprofile is always run at the start of a new session unless specifically told otherwise.
However, I only give you this with a MAJOR warning - adding code into your .Rprofile will prevent your R code from being reproducible. For this reason, I would strongly recommend you set the options call in a snippet in RStudio instead of using the .Rprofile, allowing for a keyboard shortcut to easily add to any script you run. While perhaps less convenient, I believe it's well worth the trade-off to keep your code fully reproducible. You can find more info on snippets with this support article from RStudio.
Envars
The Sys.setenv call is probably better suited to a .Renviron file.
Again, easily added with:
usethis::edit_r_environ()
Here is a nice reference to better explain the full use of .Rprofile and .Renviron files: https://cfss.uchicago.edu/notes/r-startup/
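To check that the startup files did what you expected, something like the following can be run in a fresh session. This is only a sketch; the variable names `java_home` and `saf` are illustrative, and the values you see depend entirely on your own .Rprofile and .Renviron.

```r
# Values that the question wants set at startup
java_home <- Sys.getenv("JAVA_HOME")        # "" if nothing defined it
saf <- getOption("stringsAsFactors")

# Do user-level startup files exist in the home directory?
file.exists(path.expand("~/.Rprofile"))
file.exists(path.expand("~/.Renviron"))

# These environment variables, if set, point R at alternative startup files
Sys.getenv("R_PROFILE_USER")
Sys.getenv("R_ENVIRON_USER")
```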

.First() does not execute; win7 Rgui

With this in .Rprofile (first line copied from ?Startup examples):
.First <- function() cat("\n Welcome to R!\n\n")
foo <- "bar"
I do not see the Welcome text. The following shows that .Rprofile executes.
ls()
[1] "foo"
Apparently .First() does not execute. Any idea why not?
I'm running in an Rgui console on win7pro with R v3.6.1 x64.
I already learned that I will not be able to do what I wanted to do in .First(), but I still want to know why it is not even executing. I might want to use it for something in the future. I haven't made any fancy configuration changes, and I launch the console from a shortcut to Rgui.exe.
Solved: Early on I had bad code in .First(). While troubleshooting I cleared the workspace with
rm(list=ls())
q('yes')
That way the assignment foo<-"bar" more clearly showed that .Rprofile was executing. What I didn't realize was that the bad .First() got saved in some hidden environment in .RData. After that, no matter what I did with .First() in .Rprofile, it always got replaced with the bad one. To solve the problem, I just needed to delete .RData.
Update: .First() does not go into a hidden environment, but the starting dot makes it hide from ls(). To exit with a completely clear workspace, the code would be:
rm(list = ls(all.names = TRUE))
q('yes')
That's a lot of typing. In the future, I think I'll just delete .RData.
To me this seems a lot like a bug. Any time you change .First() in .Rprofile, you need to delete .RData from every folder where you use R, or execute rm(.First) and q('yes') in every folder. That just begs for something to be missed.
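The hiding behaviour described in the update can be reproduced without touching the real workspace. A small sketch, using a throwaway environment (the name `ws` is made up here) as a stand-in for the saved workspace:

```r
ws <- new.env()   # stand-in for the workspace that ends up in .RData
assign(".First", function() cat("\n Welcome to R!\n\n"), envir = ws)
assign("foo", "bar", envir = ws)

visible <- ls(ws)                       # dot-prefixed names are skipped
everything <- ls(ws, all.names = TRUE)  # includes ".First"

# Clearing only the visible names leaves .First behind
rm(list = ls(ws), envir = ws)
left_behind <- exists(".First", envir = ws, inherits = FALSE)

# Clearing with all.names = TRUE removes it as well
rm(list = ls(ws, all.names = TRUE), envir = ws)
```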

Knitr compiling and running all at the same time in RStudio

For running an Rnw file in RStudio, one can compile or run all. Compiling does not see the variables in the current environment, and the current environment does not see the variables created while compiling. I would like to see how the output would look when I compile, and I debug the code using the environment. This requires me to compile and run, which performs the same calculations twice, which is very impractical for large projects. Is there a way to compile and have the output be seen in the environment?
When you knit a document, the work happens in a different R session, which is why you can't examine the results in the current session.
But you have a lot of choices besides run all. Take a look at the Run button: it allows you to run chunks one at a time, or run all previous chunks, etc.
If some of your chunks take too long to run, then you should consider organizing your work differently. Put the long computations into their own script, and save the results of that script using save(). Run it once, then spend time editing the display of those results in multiple runs in the main .Rnw document.
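One way this could look in practice is sketched below. The names `cache_file` and `slow_fit()` are hypothetical, and the file lives in tempdir() only so the example is harmless to run; in a real project you would point it at a file next to the .Rnw document.

```r
# Hypothetical cache location and stand-in for an expensive computation
cache_file <- file.path(tempdir(), "long_results.RData")

slow_fit <- function() {
  Sys.sleep(1)                       # pretend this takes a long time
  lm(mpg ~ wt, data = mtcars)
}

if (file.exists(cache_file)) {
  load(cache_file)                   # restores the object `fit`
} else {
  fit <- slow_fit()
  save(fit, file = cache_file)       # run once, reuse afterwards
}

coef(fit)
```

Deleting the cache file forces the computation to run again.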
Finally, if you really want to see variables at the end of a run of your vignette, you can add save.image(file = 'vignette.RData') at the end, and in your interactive session, use load('vignette.RData') to load the values for examination. This won't necessarily give you an accurate view of the state of things at the end of the run, because it will load the values in addition to anything you've already got in your workspace, it won't load option settings or attach packages, but it might be enough for debugging.
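The mixing problem mentioned above can be reduced by loading the saved values into a separate environment rather than the workspace. A minimal sketch, where `snapshot` and `snap` are illustrative names and save() on a single object stands in for the save.image() call:

```r
snapshot <- file.path(tempdir(), "vignette.RData")

# Stand-in for the last chunk of the document. The answer above uses
# save.image(file = 'vignette.RData'), which saves the whole workspace.
answer <- 42
save(answer, file = snapshot)

# In the interactive session: load into a fresh environment so nothing
# already in the workspace is overwritten or mixed in.
snap <- new.env()
load(snapshot, envir = snap)
ls(snap)
snap$answer
```

Option settings and attached packages are still not restored this way.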

Why doesn't restarting R with Ctrl-Shift-F10 clear my environment variables?

I have RStudio working on two different machines: mine and a colleague's.
When I restart R in RStudio with the Ctrl-Shift-F10 shortcut, all my global environment variables go away. Not so for my colleague's, who frequently puts rm(list=ls(all=TRUE)) in our shared code.
Is there an optional parameter somewhere, so that restarting R always clears environment variables?
Disclaimer: this has 100% effectiveness based on a sample size of 5 so far (validated by OP & Badger), but I'm recording it for posterity since other forums where I've seen similar questions (example 1, example 2) don't even have that. :)
Solution: Go to Tools / Global Options / General & change the "Save workspace to .RData on exit" dropdown option to "Never".
Possible interpretation: Even if you chose the 'Ask' option in "Save workspace to .RData on exit", Ctrl-Shift-F10 shortcut won't ask before the session gets restarted. But unless you explicitly chose to NEVER save workspace on exit, RStudio will keep it somewhere just in case. (I'm not sure where, though. There's no .RData file in my working directory corresponding to the restored environment...)

How to use objects from global environment in Rstudio Markdown

I've seen similar questions on Stack Overflow but virtually no conclusive answers, and certainly no answer that worked for me.
What is the easiest way to access and use objects (regression fits, data frames, other objects) that are located in the global R environment from an R Markdown script in RStudio?
I find it surprising that there is no easy solution to this, given the tendency of the RStudio team to make things comfortable and effective.
Thanks in advance.
For better or worse, this omission is intentional. Relying on objects created outside the document makes your document less reproducible--that is, if your document needs data in the global environment, you can't just give someone (or yourself in two years) the document and data files and let them recreate it themselves.
For this reason, and in order to perform the render in the background, RStudio actually creates a separate R session to render the document. That background R session cannot see any of the environments in the interactive R session you see in RStudio.
The best way around this problem is to take the code you used to create the contents of your global environment and move it inside your document (you can use echo = FALSE if you don't want it to show up in the document). This makes your document self-contained and reproducible.
If you can't do that, there are a few approaches you can take to use the data in the global environment directly:
Instead of using the Knit HTML button, type rmarkdown::render("your_doc.Rmd") at the R console. This will knit in the current session instead of a background session. Alternatively:
Save your global environment to an .Rdata file prior to rendering (use R's save function), and load it in your document.
Well, in my case I found the following solution:
(1) Save your global environment to an .RData file in the same folder as your .Rmd file. (You just need to click the floppy disk icon on the "Environment" panel.)
(2) Write the following code in your R Markdown script:
load(file = "filename.RData") # loads the file that you saved before
and stop suffering.
In RStudio, go to 'Tools', then 'Global Options', and open the 'R Markdown' tab. Under 'Evaluate chunks in directory', select the option 'Documents', and the R Markdown knitting engine will access the global environment as plain R code does. Hope this helps those who search for this info!
The thread is old but in case anyone's still looking for a solution (as I was):
You can pass an envir parameter to the render() (or knit() function) so that it can access objects from the environment it was called from.
rmarkdown::render(
  input = input_rmd,
  output_file = output_file,
  envir = parent.frame()
)
I have the same problem myself. Some stuff is pretty time consuming to reproduce every time.
I think there could be another answer. What if you save your environment with the save.image() function to a different file than the standard .RData one? Then, bring it back with load().
To be sure you are using the same data, use the md5sum() from tools.
Cheers, Cord
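The md5sum() check suggested above might look roughly like this. It is a sketch only: `env_file` and `checksum` are made-up names, and save() on one object is used in place of save.image() to keep the example small.

```r
env_file <- file.path(tempdir(), "analysis_env.RData")

# Save the expensive object(s) and record the file's checksum
fit <- lm(mpg ~ wt, data = mtcars)
save(fit, file = env_file)
checksum <- unname(tools::md5sum(env_file))

# Later, e.g. in the first chunk of the document: refuse to continue
# if the file is not the one the checksum was recorded for.
stopifnot(identical(unname(tools::md5sum(env_file)), checksum))
load(env_file)
```

In a real document the recorded checksum would be written down in the .Rmd itself rather than computed in the same session.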
I think I solved this problem by referring to the package explicitly in the code that is being knitted. Using the yarrr package, for example, I loaded the dataframe "pirates" using data(pirates). This worked fine at the console and within an Rstudio code chunk, but with knitr it failed following the pattern in the question above. If, however, I loaded the data into memory by creating an object using pirates <- yarrr::pirates, the document then knitted cleanly to HTML.
You can load the script in the desired environment as follows:
```{r, include=FALSE}
source("your-script.R", local = knitr::knit_global())
# or sys.source("your-script.R", envir = knitr::knit_global())
```
Next in the R Markdown document, you can use objects created in these scripts (e.g., data objects or functions).
https://bookdown.org/yihui/rmarkdown-cookbook/source-script.html
One option that I have not yet seen is the use of parameters.
This chapter goes through a simple example of how to do this.
