I'm working on a large project in RStudio with code that takes a long time to load. Is there an efficient way to save the state of my workspace and variables so that I can easily reopen the workspace later with all variables loaded? I know a little about .RData files, but I'm not sure how to use them for this. Thanks!
RStudio (and R itself) can save your latest state to a .RData file in your working directory (the one returned by getwd()) when you quit.
You may have deactivated this during installation, in which case you can activate it again under Tools -> Global Options -> General, where you'll see the option under Workspace.
If this doesn't work, you can do it manually by calling save.image() before you quit and load('.RData') on startup.
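A minimal sketch of the manual route, assuming the code runs at the top level of a session. The temporary file location and the object names are made up for illustration; in a real project the file would typically be .RData in the project directory.

```r
# Hypothetical location for the saved image.
image_file <- file.path(tempdir(), "my_workspace.RData")

# Objects standing in for slow-to-compute results.
slow_result <- cumsum(1:10)
model_data  <- head(mtcars)

# Save every object in the global environment to the file.
save.image(file = image_file)

# Simulate a fresh session by removing the objects, then restore them.
rm(slow_result, model_data)
load(image_file)

print(exists("slow_result"))
```

Passing an explicit file name to save.image() and load() avoids depending on whatever the working directory happens to be at the time.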
Related
When I set the working directory using the RStudio IDE GUI and then go to create a new file, the default directory is not the current working directory. What is a better workflow? Are there any .Rprofile commands to always save in the working directory?
As mentioned in the comments, using a project would certainly help. Projects always set the working directory to the project root folder on opening. Additionally, previous commands and the working environment are saved (if you want them to be), so you can get right back to where you were. This is especially useful if you are working on different assignments.
Additionally, you can use getwd() to get your current working directory. Remember that the RStudio file-browser doesn't update when you set your working directory in R.
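A small sketch of checking and changing the working directory from code. The folder name is hypothetical and is created under the temporary directory so nothing real is touched.

```r
# Remember where we started so it can be restored afterwards.
original_dir <- getwd()

# Hypothetical project folder, created under the temporary directory.
project_dir <- file.path(tempdir(), "demo_project")
dir.create(project_dir, showWarnings = FALSE)

# setwd() invisibly returns the previous working directory.
previous_dir <- setwd(project_dir)

# Relative paths now resolve against the new working directory.
writeLines("hello", "notes.txt")
print(file.exists(file.path(project_dir, "notes.txt")))

# Go back to where we started.
setwd(original_dir)
```

Inside an RStudio project this juggling is rarely needed, since the project root is already the working directory when the project opens.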
When exiting RStudio, I'm usually prompted to save the workspace image to ~/.RData. I accidentally clicked Save at some point, and now my Global Environment automatically loads several functions and datasets when I open RStudio. I now have to clear all objects from the workspace every time I start RStudio.
How do I remove the data (or setting) that automatically loads the saved data so that the data is not loaded on startup?
It told you the filename. Just delete that file, i.e. ~/.RData.
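If you would rather do the deletion from R, here is a rough sketch. The helper name remove_saved_workspace is made up, and the demonstration runs against a throwaway folder rather than the real home directory.

```r
# Delete a saved workspace image from a directory, if one exists.
# Returns TRUE when a file was removed and FALSE otherwise.
remove_saved_workspace <- function(dir = "~") {
  rdata <- file.path(path.expand(dir), ".RData")
  if (file.exists(rdata)) {
    file.remove(rdata)
  } else {
    FALSE
  }
}

# Demonstration on a throwaway directory rather than the real home folder.
demo_dir <- file.path(tempdir(), "fake_home")
dir.create(demo_dir, showWarnings = FALSE)
x <- 1
save(x, file = file.path(demo_dir, ".RData"))

removed <- remove_saved_workspace(demo_dir)
print(removed)
```

Calling remove_saved_workspace() with no arguments would target ~/.RData, the file named in the prompt.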
You can also set RStudio's default not to restore the workspace: in Tools | Global options | General, untick "Restore .RData into workspace at startup". There's another option there to say not to save it on exit.
I have an .Rmd which I use to report on data quality in a number of different R projects. It then splits the data to remove subsets with missing data and interpolates missing results where appropriate. It does this via a write.csv command to a file path of the form "./Cleansed_data/".
To make an example:
open rstudio
go to the rhs 'project' menu, and select and make a new project wherever you'd like
go to the lhs 'new script' drop down and select 'new .Rmd'
change the output to .pdf and hit ok
in the last r chunk include write.csv(mtcars, file = "mtcars.csv")
hit the 'knit pdf' button, save the report as "writeFile.Rmd" to your project working directory, and let it run.
Previously I moved this .Rmd from place to place; however, I would now like to build it into an internal package. I have included it (as the documentation indicates) in inst/rmd within the package directory.
In order to do this:
build or open any package you have access to
add the file to inst/rmd (create it if this doesn't exist)
rebuild the package
I then rebuild the package and open a new project. I load my new package and attempt to run the document with the render command, using system.file to locate the .Rmd, like so:
rmarkdown::render(input = system.file("rmd/writeFile.Rmd", package = "MyPackage"),
                  output_file = "writeFile.pdf", output_dir = "./Cars/")
This will render the report from the package build into the folder given by output_dir; however, there are a number of pitfalls here. First, if I omit the output_dir argument, the report renders into the package library, usually located inside the R installation's library folder on the C drive. This is fixable, however.
What I can't get around is that when the .Rmd hits write.csv(), (I believe) the .Rmd is being rendered in the package environment, whose working directory is the package library folder, not the current project directory.
The Questions
How can I inform the template in the package what the current working directory is for the RStudio project? I'm vaguely aware there might be an RStudio API package, but I have nearly no understanding of what it is or whether it would provide a solution.
If this is either outright impossible or just potentially a very bad idea, how can I modify the workflow to successfully retrieve a number of R object outputs into the environment or the working directory on the call to the report, without having to modify the report for each different project? Further, why specifically is this approach such a bad plan?
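One possible direction, sketched here as an assumption and not taken from the thread: rmarkdown::render() has a knit_root_dir argument that sets the directory in which the chunks are evaluated. The wrapper name render_packaged_report and its arguments are hypothetical, and the rmarkdown package must be installed when the function is actually called.

```r
# Hypothetical wrapper: render a report shipped in a package's inst/rmd
# folder, evaluating its chunks in `project_dir` instead of the package
# library folder.
render_packaged_report <- function(report, package,
                                   project_dir = getwd(),
                                   output_file = NULL) {
  # system.file() returns "" when the file (or package) cannot be found.
  input <- system.file("rmd", report, package = package)
  if (!nzchar(input)) {
    stop("Report '", report, "' not found in package '", package, "'")
  }
  rmarkdown::render(
    input         = input,
    output_file   = output_file,
    output_dir    = project_dir,   # where the rendered document goes
    knit_root_dir = project_dir    # working directory for the chunks
  )
}
```

With something like this, a call such as render_packaged_report("writeFile.Rmd", "MyPackage", output_file = "writeFile.pdf") from inside the project should make a relative path in write.csv() resolve against the project directory, though I have not tested it against a read-only package library.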
In order to close this off:
I have chosen to keep the .Rmd included in the package. The .Rmd needs to move and be versioned with the package, as the package holds the functions it uses to run.
In order to meet my requirements, I write the documents to grab the project directory via the RStudio API, in the form:
write.csv(mtcars, file = file.path(rstudioapi::getActiveProject(), "mtcars.csv"))
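A small sketch of why the way the path is joined matters, assuming the project path comes back without a trailing slash. The path below is made up and stands in for the value rstudioapi::getActiveProject() would return inside RStudio.

```r
# Hypothetical project path (no trailing slash).
project_root <- "C:/Users/me/Projects/Cars"

# paste0() glues the strings together with no separator.
pasted <- paste0(project_root, "mtcars.csv")

# file.path() inserts the "/" separator between components.
joined <- file.path(project_root, "mtcars.csv")

print(pasted)
print(joined)
```

Under that assumption the paste0() version would write a file next to the project folder instead of inside it, so file.path() is the safer choice.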
Having tested @CL's answer, I can confirm that it also runs and is not dependent on RStudio as an IDE. However, I know that these documents will:
Always be accessed via the RStudio IDE
Always be accessed from within a specific project
I fear (though have not tested) that there could be other impacts from artificially forcing the file into a different working directory. Potentially this could affect things like child documents I might want to include later, or other code that needs to be relative to the file path of the package installation, not the project. In this way I think (if I interpreted Yuhui correctly) the R doc is still the centre of its own universe. It just writes its data into another one :)
When I start an R session from some directory, R automatically loads the corresponding workspace (if it exists). After I finish working in this workspace, I can decide whether I want to modify (save) the current workspace. This logic is simple and clear.
What I do not understand, is what happens if I start R from some directory and then change the working directory by setwd(). As far as I understood the workspace corresponding to the new working directory will not be "loaded". I still see the variables and history from the previous working directory. Why?
Second, when I quit() R, I replace the workspace image corresponding to the "new" working directory with the workspace corresponding to the "old" directory. Do I interpret the behavior correctly? What is the logic behind this behavior? Can I switch to another workspace from within an R session?
Workspaces are stored in .RData files and are automatically loaded from the current working directory when you start R. But the working directory itself (and the setwd() function that sets it) has nothing to do with the workspace. You can load any workspace by explicitly specifying any .RData file:
load("c:/project/myfile.RData")
or
setwd("c:/project/")
load("myfile.RData")
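As a rough sketch of looking at another workspace without clobbering the objects already in your session, load() can be pointed at a separate environment. The folder, file, and object names here are made up for illustration.

```r
# Save a small stand-in workspace to a hypothetical project folder.
project_dir <- file.path(tempdir(), "other_project")
dir.create(project_dir, showWarnings = FALSE)
workspace_file <- file.path(project_dir, "myfile.RData")

fit_summary <- c(intercept = 1.5, slope = 0.25)
save(fit_summary, file = workspace_file)
rm(fit_summary)

# Load the file into its own environment instead of the global one,
# so objects already in the session are not overwritten.
other_ws <- new.env()
loaded_names <- load(workspace_file, envir = other_ws)

print(loaded_names)
print(ls(other_ws))
```

load() returns the names of the objects it restored, which makes it easy to see what a given .RData file actually contains.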
I usually don't want to save the workspace and history in R, in order to keep the console clean. I tried putting rm(list=ls()) in my .Rprofile, but it doesn't work: the workspace is still restored from /Users/xxx/.RData, as is the history. Does anybody know how to set up the .Rprofile to clean the workspace and history automatically every time I start the R console?