Saving and retrieving temp files in R packages - r

How should I save a file in my R package?
In other words, the user of my package will download a remote text file, and I'd like that file to be available to the user the next time they load the package.
The function I've created currently saves the file using tempfile(). As it's written, the download will happen every time the function is called, and a new temp file will be created. I think this presents a problem - if the user runs the function many times, the /var folder will become bloated with temp files. To fix this, I'd like to check if the file already exists and use the existing version if it does instead of downloading it. Unfortunately, the path returned by tempfile() is lost when the function terminates, so the path is not available to check if the file exists the next time the function is called.
Can I save the file in my package's /data directory? If so, how do I find this directory from the user's current working directory?
Saving in a tempfile is not necessary. I'm more interested to learn how to persist data in my package.
Thank you.

You can use tempdir() and then a fixed file name:
f <- file.path(tempdir(), "mypackage_thefile.csv")
if (file.exists(f)) {
  d <- read.csv(f)
} else {
  # download it and save to f
}
These files are removed at the end of each session (when the temporary directory is deleted).
You can also ship the data with your package if it is not too large (read it into R and then save it as a compressed 'rds' file with saveRDS). But you normally cannot have the user download it into the package directory, which is often write-protected.
To add data to a package, either put it in the data folder (use save() to create an .RData file, for retrieval with data()), or in the inst/extdata folder, in any appropriate format, for retrieval with
f <- system.file("extdata/filename", package="pkgname")
See the Writing R Extensions manual.
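A minimal sketch of the check-then-download pattern described above. The helper name cached_file and its fetch argument are hypothetical, not part of any package; fetch stands in for whatever writes the file (for example a wrapper around download.file). By default it caches in tempdir(), so the file disappears with the session.

```r
# Hypothetical helper: return the path to a cached file, fetching it only
# when no copy exists yet. `fetch` is a function that writes to the given path.
cached_file <- function(name, fetch, dir = tempdir()) {
  path <- file.path(dir, name)
  if (!file.exists(path)) {
    # Create the cache directory if needed; silent when it already exists.
    dir.create(dir, recursive = TRUE, showWarnings = FALSE)
    fetch(path)
  }
  path
}

# Example with a stand-in fetcher that writes a tiny CSV instead of downloading.
demo_fetch <- function(path) writeLines(c("a,b", "1,2"), path)
f <- cached_file("mypackage_thefile.csv", demo_fetch,
                 dir = file.path(tempdir(), "mypackage_cache"))
d <- read.csv(f)
```

If the file should survive across sessions, one option (assuming R 4.0.0 or later) is to pass dir = tools::R_user_dir("mypackage", which = "cache"), which points at a per-user, writable cache directory rather than the package directory.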

Related

How do you edit a .rds file in RStudio?

I am trying to run an R script that I've inherited from a colleague. This script references a .rds file called config.rds. It stores some configuration settings. I need to change those settings. However, when I attempt to open the file in the Rstudio editor, a "Load R object" prompt pops up. I cannot figure out how to open the file for editing.
You can't open the file for editing - it is a binary file that stores the internal representation of R data objects.
You can only read it into R to create a new R object, and then save a modified copy of that object to a new (or the same) .rds file. Example:
config = readRDS("config.rds")
config$username = "fnord"
saveRDS(config, "config.rds")
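A self-contained variant of the same round trip, in case it helps to try it without touching the real file. The contents of the stand-in config (username, retries) are invented for illustration; the real config.rds may hold something quite different, which is why str() is worth running first.

```r
# Build a stand-in config.rds in the session's temp directory (made-up contents).
path <- file.path(tempdir(), "config.rds")
saveRDS(list(username = "old", retries = 3), path)

config <- readRDS(path)   # read the object back into R
str(config)               # inspect its structure before changing anything
config$username <- "fnord"
saveRDS(config, path)     # overwrite the file with the modified object

updated <- readRDS(path)  # re-read to confirm the change was stored
```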

Same R history from different workspace

I am new to R and I just figured out why my history did not contain all my previous commands: R creates a .Rhistory file in each working directory.
I often change working directory and I would like to have the history of all my past sessions in the same file. Is there a simple way to do that?
Thanks.
(I am on Mac OS 10.6 and I use Rstudio)
An easy way would be to manually save your history like this:
savehistory(file = "~/.Rhistory")
and then load it when you open an R command session:
loadhistory(file = "~/.Rhistory")
Otherwise you can edit your 'Rprofile.site' file and call loadhistory() from the function .First and savehistory() from the function .Last.
More information about Rprofile.site:
At startup, R will source the Rprofile.site file. It will then look for a .Rprofile file to source in the current working directory. If it doesn't find it, it will look for one in the user's home directory. There are two special functions you can place in these files. .First( ) will be run at the start of the R session and .Last( ) will be run at the end of the session.
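A sketch of what those two functions might look like in a profile file, assuming a single shared history file at ~/.Rhistory. The utils:: prefix is used because the default packages are not yet attached when .First runs, and the interactive() guard is there because history is generally only available in interactive sessions.

```r
# Load the shared history at the start of an interactive session.
.First <- function() {
  if (interactive()) {
    try(utils::loadhistory("~/.Rhistory"), silent = TRUE)
  }
}

# Save the history back to the same file when the session ends.
.Last <- function() {
  if (interactive()) {
    try(utils::savehistory("~/.Rhistory"), silent = TRUE)
  }
}
```

Some front ends manage history themselves, so whether this has any effect there is something to check rather than assume.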

Does running R.exe create temporary files?

I'm wondering
does launching R.exe on windows create temporary files and
does interpreting something like x <- 5 write to those temporary files?
If temporary files are created where are they stored and what happens if I launch several instances of R.exe? Will they share and overwrite each others temporary files?
Each instance of R gets its own temporary directory. You can see that pretty easily below the default temporary directory on your system (e.g. /tmp for me; on Windows I usually set TEMPDIR and TMPDIR to C:\TMP and find them there; I forget where they go otherwise). When you invoke tempfile() or tempdir() you can read off the path:
R> tempfile()
[1] "/tmp/RtmpDVDtmj/file6a27612c4c83"
R>
So the R session in which I typed this uses /tmp/RtmpDVDtmj/.
The directory name is randomized and safe from other R instances running at the same time.
At exit of R, the directory is purged.
And no, the simple assignment x <- 5 will not involve a temporary file.
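A small sketch you can run to check this yourself. The variable names are only illustrative; the point is that tempfile() merely returns a name inside the session directory, and nothing is written until you write to it.

```r
session_dir <- tempdir()                        # this session's private directory
tmp <- tempfile(pattern = "demo", fileext = ".txt")

# The generated name lives inside the session directory.
same_dir <- identical(normalizePath(dirname(tmp)), normalizePath(session_dir))

existed_before <- file.exists(tmp)  # tempfile() does not create the file
x <- 5                              # a plain assignment touches no file
writeLines("hello", tmp)            # only an explicit write creates it
exists_after <- file.exists(tmp)
```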

Locate the ".Rprofile" file generating default options

In R and RStudio, I think I have messed around with the .Rprofile file a few times, and I am currently loading an old version of it when R or RStudio starts. Is there a way to quickly find the location of the file that is generating the default options?
Thanks
As @Gsee suggested, ?Startup has all you need. Note that there isn't just the user profile file, but also a site profile file you could have changed, and that both files can be found in multiple locations.
You could run the following to list existing files on your system among those listed on the page:
candidates <- c(Sys.getenv("R_PROFILE"),
                file.path(Sys.getenv("R_HOME"), "etc", "Rprofile.site"),
                Sys.getenv("R_PROFILE_USER"),
                file.path(getwd(), ".Rprofile"),
                file.path(Sys.getenv("HOME"), ".Rprofile"))
Filter(file.exists, candidates)
Note that it should be run in a fresh session, right after you started R, so that getwd() returns the current directory at startup. There is also the tricky possibility that your profile files change the current directory at startup, in which case you would have to start a "no-profile" session (run R --no-site-file --no-init-file) before running the code above.

R workspaces i.e. .R files

How do I start a new .R file default in a new session for new objects in that session?
Workspaces are .RData files, not .R files. .R files are source files, i.e. text files containing code.
It's a bit tricky. If you save the workspace, R writes two files to the current working directory: an .RData file with the objects and an .Rhistory file with the command history. In earlier versions of R, these were saved in the R directory itself. With my version, 2.11.1, it uses the desktop.
If you start R and it says "[Previously saved workspace restored]", then it loaded the files ".RData" and ".Rhistory" from the default working directory. You can find that directory with the command
getwd()
If it's not the desktop or similar, then you can use
dir()
to see what's inside. For me that doesn't work, as I only have the file "desktop.ini" there (thank you, bloody Windoze).
Now there are two options: you manually rename the workspace, or use the command:
save.image(file="filename.RData")
to save the workspaces before you exit. Alternatively, you can set those options in the file Rprofile.site. This is a text file containing the code R has to run at startup. The file resides in the subdirectory /etc of your R directory. You can add to the bottom of the file something like :
fn <- paste("Wspace",Sys.Date(),sep="")
nfiles <- length(grep(paste(fn,".*.RData",sep=""),dir()))
fn <- paste(fn,"_",nfiles+1,".RData",sep="")
options(save.image.defaults=list(file=fn))
Beware: this doesn't do a thing if you save the workspace by clicking "yes" on the message box. You have to use the command
save.image()
right before you close your R-session. If you click "yes", it will still save the workspace as ".RData", so you'll have to rename it again.
I believe that you can save your current workspace using save.image(), which will default to the name ".RData". You can load a workspace simply using load().
If you're loading a pre-existing workspace and you don't want that to happen, rename or delete the .RData file in the current working directory.
If you want to have different projects with different workspaces, the easiest thing to do is create multiple directories.
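A rough sketch of keeping a named workspace file per project. It uses save() and load() with explicit environments so it can be run anywhere without touching your real workspace; save.image(file = ...) does the equivalent for the global environment. The file name project_a.RData is only an example.

```r
ws <- file.path(tempdir(), "project_a.RData")   # one workspace file per project

# Put some objects in an environment standing in for the project's workspace.
project <- new.env()
assign("a", 1:3, envir = project)
assign("label", "first run", envir = project)

# Save every object in that environment to the named file.
save(list = ls(project), file = ws, envir = project)

# Later (or in another session): restore the objects into a fresh environment.
restored <- new.env()
load(ws, envir = restored)
ls(restored)
```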
There is no connection between sessions, objects, and .R source files. In short: there is no need to.
You may enjoy walking through the worked example at the end of the Introduction to R - A Sample Session.
Fire up R in your preferred environment and execute the commands one-by-one.
