RStudio: using different package versions for each .Rproj - r

I have a few older R projects I'm working with, which depend on several currently deprecated (or heavily modified) packages. For everything to work smoothly I use older versions of those packages, which I have saved in another folder and copy manually to %userprofile%\documents\R\win-library\3.3 when necessary. However, this is not convenient, especially if I want to run multiple projects simultaneously, some of which require the new and updated versions of the packages.
My question - is there a way to specify custom directories for each .Rproj from which it would take and load the libraries?

You can solve this much more simply:
Have a top-level directory for each project, call projA, projB, ...
Within each of these, create a directory libs/, say.
And within each of these directories have a file .Rprofile with a single assignment such as .libPaths("./libs")
Now when you start R in the different project directories, each will have a separate library directory at the front of the library path, allowing you to place per-project overrides there.
In a nutshell, the approach outlined here allows you to keep the local and modified packages around as you please. (You can even assign common directories via .libPaths() if you so choose.)
The nice thing is that this
work with any R invocation, batch or GUI or RStudio or shiny or ...
does not depend on any other tools, and hence
does not rely on RStudio or .Rproj files -- though you are free to use RStudio as well.
As so often, Base R is there for you.
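A minimal sketch of what such a project-level .Rprofile could contain. The directory name libs comes from the answer above; creating the directory on the fly and keeping the existing paths after it are assumptions added for illustration, not part of the original answer.

```r
# Sketch of a per-project .Rprofile (assumes R is started in the project directory).
local({
  lib <- file.path(getwd(), "libs")      # project-local library, as named in the answer
  if (!dir.exists(lib)) {
    dir.create(lib, recursive = TRUE)    # .libPaths() silently drops directories that do not exist
  }
  # Put the project library first; the previously configured libraries stay behind it.
  .libPaths(c(lib, .libPaths()))
})
print(.libPaths())
```

With this in place, install.packages() called without a lib argument installs into libs/, because it defaults to the first element of .libPaths().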

One option is to use the checkpoint package by Revolution Analytics.
You can indicate for each main R file in a project the date for which you wish to load a set of packages. You can read a bit more about it here.
To list the snapshot dates that are available on the mirror, use getValidSnapshots(mranRootUrl = mranUrl()).
To create a checkpoint:
# Create temporary project and set working directory
example_project <- paste0("~/checkpoint_example_project_", Sys.Date())
dir.create(example_project, recursive = TRUE)
oldwd <- setwd(example_project)
# Write dummy code file to project
cat("library(MASS)", "library(foreach)",
    sep = "\n",
    file = "checkpoint_example_code.R")
# Create a checkpoint by specifying a snapshot date
library(checkpoint)
checkpoint("2014-09-17")
# Check that CRAN mirror is set to MRAN snapshot
getOption("repos")
# Check that library path is set to ~/.checkpoint
.libPaths()
# Check which packages are installed in checkpoint library
installed.packages()
# cleanup: leave the project directory before deleting it
setwd(oldwd)
unlink(example_project, recursive = TRUE)

Related

R: instructions for unbundling and using a packrat snapshot

I used packrat (v 0.4.8-1) to create a snapshot and bundle of the R package dependencies that go along with the corresponding R code. I want to provide the R code and packrat bundle to others to make the work I am doing (including the R environment) fully reproducible.
I tested unbundling using a different computer from the one I used to write R code and create the bundle. I opened an R code file in R studio, and called library(packrat) to load packrat (also v 0.4.8-1). I then called packrat::unbundle(bundle = "directory", where = "directory"), which unbundled successfully. But subsequently calling packrat::restore() gave me the error "This project has not yet been packified. Run 'packrat::init()' to init packrat". It seems like init() should not be necessary because I am not trying to create a new snapshot, but rather utilize the one in the bundle. The packrat page (https://rstudio.github.io/packrat/) and CRAN provide very little documentation about unbundling to help troubleshoot this, or that I could point users of my code to for instructions (who likely will be familiar with R, but may not have used packrat).
So, can someone please provide clear step-by-step instructions for how users of a bundled snapshot should unbundle, and then use that saved snapshot to run an R code file?
After some experimenting, I found an approach that seems to have worked so far.
I have provided users with three files:
-tar.gz (packrat bundle file)
-unbundle.R (R code file that includes a library statement to load
the packrat library, and the unbundle command for the tar.gz file)
-unbundle_readme.txt
The readme file includes instructions similar to those below, and so far users have been able to run R code using the package dependencies. The readme file tells users about requirements (R, R studio, packrat, R package development prerequisites (Rtools for Windows, XCode for Mac)), and includes output of sessionInfo() to document R package versions that the R code should use after instructions are followed. In the example below 'code_folder' refers to a folder within the tar.gz file that contains R code and associated input files.
Example unbundle instructions:
Step 1
Save, but do not expand/unzip, the tar file to a directory.
Problems with accessing the saved package dependencies
are more likely when a program other than R or R studio
is used to unbundle the tar file.
If the tar file has already been expanded, re-save the
tar file to a new directory, which should not be the same
directory as the expanded tar file, or a subdirectory of
the expanded tar file.
Step 2
Save unbundle.R in the same directory as the tar file
Step 3
Open unbundle.R using R studio
Step 4
Execute unbundle.R
(This will create a subfolder ‘code_folder’.
Please note that this step may take 5-15 minutes to run.)
Step 5
Close R studio
Step 6
Navigate to the subfolder ‘code_folder’
Step 7
Open an R script using R studio
(The package library should correspond to that listed below.
This will indicate R studio is accessing the saved package
dependencies.)
Step 8
Execute the R code, which will utilize the project package library.
After the package library has been loaded using the above
steps, it is not necessary to re-load the package library for each
script. R studio will continue to access the package dependencies
for each script you open within the R studio session. If you
subsequently close R-studio, and then open scripts from within
the unbundle directory, R studio should still access the
dependencies without requiring re-loading of the saved package
snapshot.
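The unbundle.R file itself is not shown above; the following is only a sketch of what it might look like. The helper name unbundle_here and the example bundle file name are hypothetical. packrat::unbundle() is the call already used in the question, and by default it also restores the bundled library.

```r
# Hypothetical unbundle.R, assuming the bundle sits in the same directory as this script.
unbundle_here <- function(bundle_file, where = dirname(bundle_file)) {
  # Fail early with a clear message if the tar.gz is not where we expect it.
  stopifnot(file.exists(bundle_file))
  # Unpacks the project into a subfolder of `where` and restores its private library.
  packrat::unbundle(bundle = bundle_file, where = where)
}

# Example call (file name is a placeholder, so it is left commented out):
# unbundle_here("my_analysis.tar.gz")
```

After this has run, opening a script from inside the unbundled folder in a fresh R session should pick up the project library, as described in Steps 5 to 7.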

Remove path from .libPaths() so just a single non standard path is left as library

I want to have a single library in R, which is not the default.
The idea is to push the needed Rprofiles or environment variables out to all network computers, so that all of them use the same R repository.
I added an environment variable to add the new lib, but I can't figure out how to get rid of the standard library. I don't know how to edit the Rprofile.
> Sys.getenv("R_LIBS_USER")
[1] "X:/R Repository Database"
> Sys.getenv("R_LIBS")
[1] "X:/R Respository Database"
> .libPaths()
[1] "X:/R Repository Database" "C:/ProgramFiles/R/R-3.2.5/library"
You can't change the system setting for the package directory ($R_HOME/library), nor should you. That directory contains the packages that come with R, including the base package, and it's likely that R would fail to start correctly if you tried pointing it elsewhere.
But this is really a distraction. The main sources of incompatibilities come from using different versions of user-contributed packages. Those you can control by having a site-wide package directory, which is what you've done. Incompatibilities due to different versions of system packages are really down to using different versions of R; if you want to avoid those, then install only one R version.
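A small sketch of the behaviour described above: whatever is passed to .libPaths(), R keeps the system library (.Library) on the path. The temporary directory stands in for the shared network folder and is an assumption for illustration only.

```r
# Demonstrate that the system library cannot be dropped from the search path.
old_paths <- .libPaths()                              # remember the current setting
site_lib  <- file.path(tempdir(), "site-library")     # hypothetical stand-in for the shared folder
dir.create(site_lib, showWarnings = FALSE)

.libPaths(site_lib)                                   # ask for the shared folder only
print(.libPaths())                                    # the system library is still listed

system_lib_kept <- normalizePath(.Library) %in% normalizePath(.libPaths())
print(system_lib_kept)

.libPaths(old_paths)                                  # restore the original setting
```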

R setting library path via R_LIBS

I have read the R FAQS and other posts but I am a bit confused and would be grateful to know whether I did everything correctly.
In Windows, in order to modify the default library folder, I created a file Renviron.site and put it inside E:/Programs/R-3.3.0/etc.
The file has only one line saying
R_LIBS=E:/Rlibrary
When I open R and run .libPaths() I see E:/Rlibrary as [1] and the default R library E:/Programs/R-3.3.0/library as [2].
This should mean that from now on all packages I will install will go in E:/Rlibrary but at the same time I will be able to load and use both packages in this folder and those in the default location. Am I correct?
When you load a package via library, it will go through each directory in .libPaths() in turn to find the required package. If the package hasn't been found, you will get an error. This means you can have multiple versions of a package (in different directories), but the package that will be used is determined by the order of .libPaths().
Regarding how .libPaths() is constructed, from ?R_LIBS
The library search path is initialized at startup from the
environment variable 'R_LIBS' (which should be a colon-separated
list of directories at which R library trees are rooted) followed
by those in environment variable 'R_LIBS_USER'. Only directories
which exist at the time will be included.
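To see the search order in practice, here is a short sketch that reports which library a package would be taken from. The helper name which_library is hypothetical; find.package() is base R and uses the first match when a package is installed in more than one library.

```r
# Report the library directory that supplies a given package.
which_library <- function(pkg) {
  # Search the libraries in .libPaths() order; the first hit wins.
  dirname(find.package(pkg, lib.loc = .libPaths()))
}

print(.libPaths())             # the search order
print(which_library("stats"))  # 'stats' ships with R, so this is the default library
```

If the same package is installed in both E:/Rlibrary and the default library, the copy in whichever directory comes first in .libPaths() is the one reported here and the one library() loads.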

How do I change the default library path for R packages

I have attempted to install R and R studio on the local drive on my work computer as opposed to the organization network folder because anything that runs through the network is really slow. When installing, the destination path shows that it's my local C: drive. However, when I install a new package, the default path shown is my network drive and there is no option to change it:
.libPaths()
[1] "\\\\The library/path/I/don't/want"
[2] "C:/Program Files/R/R-3.2.1/library"
I'm running windows 7 professional. How can I remove library path [1] and make path [2] my primary for all base packages and all new packages that I install?
Windows 7/10: If your C:\Program Files (or wherever R is installed) is blocked for writing, as mine is, then you'll get frustrated editing Rprofile.site (as I did). As specified in the accepted answer, I updated R_LIBS_USER and it worked. However, even after reading the fine manual several times and extensive searching, it took me several hours to do this. In the spirit of saving someone else time...
Let's assume you want your packages to reside in C:\R\Library:
Create the folder C:\R\Library. Next I need to add this folder to the R_LIBS_USER path:
Click Start --> Control Panel --> User Accounts --> Change my environmental variables
The Environmental Variables window pops up. If you see R_LIBS_USER, highlight it and click Edit. Otherwise click New. Both actions open a window with fields for Variable and Value.
In my case, R_LIBS_USER was already there, and Value was a path to my desktop. I added to the path the folder that I created, separated by semicolon. C:\R\Library;C:\Users\Eric.Krantz\Desktop\R stuff\Packages.
(NOTE: In the last step, I could have removed the path to the Desktop location and simply left C:\R\Library).
See help(Startup) and help(.libPaths) as you have several possibilities where this may have gotten set. Among them are
setting R_LIBS_USER
assigning .libPaths() in .Rprofile or Rprofile.site
and more.
In this particular case you need to go backwards and unset wherever \\\\The library/path/I/don't/want is set.
Otherwise, to ignore it, you need to override it explicitly, i.e. via
library("somePackage", lib.loc=.libPaths()[-1])
when loading a package.
Facing the very same problem (avoiding the default path on a network), I came up with this solution using the hints given in other answers.
The solution is editing the Rprofile file to overwrite the variable R_LIBS_USER which by default points to the home directory.
Here the steps:
Create the target destination folder for the libraries, e.g.,
~/target.
Find the Rprofile file. In my case it was at C:\Program Files\R\R-3.3.3\library\base\R\Rprofile.
Edit the file and change the definition of the variable R_LIBS_USER. In my case, I replaced the line file.path(Sys.getenv("R_USER"), "R", with file.path("~/target", "R", (use a forward slash: inside an R string, "\t" is read as a tab character).
The documentation that supports this solution is here
Original file with:
if(!nzchar(Sys.getenv("R_LIBS_USER")))
    Sys.setenv(R_LIBS_USER=
               file.path(Sys.getenv("R_USER"), "R",
                         "win-library",
                         paste(R.version$major,
                               sub("\\..*$", "", R.version$minor),
                               sep=".")
                         ))
Modified file:
if(!nzchar(Sys.getenv("R_LIBS_USER")))
    Sys.setenv(R_LIBS_USER=
               # forward slash: "~\target" would be parsed as "~<TAB>arget"
               file.path("~/target", "R",
                         "win-library",
                         paste(R.version$major,
                               sub("\\..*$", "", R.version$minor),
                               sep=".")
                         ))
Windows 10 on a Network
Having your packages stored on the network drive can slow down the performance of R / R Studio considerably, and you spend a lot of time waiting for the libraries to load/install, because data has to be pushed to and retrieved from the server. See the following for instructions on how to create an .Rprofile that points R at a library on your local machine:
Create a directory called C:\Users\xxxxxx\Documents\R\3.4 (or whatever R version you are using, and where you will store your local R packages- your directory location may be different than mine)
On R Console, type Sys.getenv("HOME") to get your home directory (this is where your .Rprofile will be stored, and R will always check there at startup; it is on the network if your packages are stored there)
Create a file called .Rprofile and place it in :\YOUR\HOME\DIRECTORY\ON_NETWORK (the directory you get after typing Sys.getenv("HOME") in R Console)
File contents of .Rprofile should be like this:
#search 2 places for packages- install new packages to first directory- load built-in packages from the second (this is from your base R package- will be different for some)
.libPaths(c("C:/Users/xxxxxx/Documents/R/3.4", "C:/Program Files/Microsoft/R Client/R_SERVER/library"))
message("*** Setting libPath to local hard drive ***")
#insert a sleep command at line 12 of the unpackPkgZip function. So, just after the package is unzipped.
trace(utils:::unpackPkgZip, quote(Sys.sleep(2)), at=12L, print=TRUE)
message("*** Add 2 second delay when installing packages, to accommodate virus scanner for R 3.4 (fixed in R 3.5+)***")
# fix problem with tcltk for sqldf package: https://github.com/ggrothendieck/sqldf#problem-involvling-tcltk
options(gsubfn.engine = "R")
message("*** Successfully loaded .Rprofile ***")
Restart R Studio and verify that you see that the messages above are displayed.
Now you can enjoy faster performance of your application on local host, vs. storing the packages on the network and slowing everything down.
I was struggling for a while with this as my work computer (with Windows 10) created the default user library on a network drive, which would slow down R and RStudio to an unusable state.
In case this helps someone, this is the easiest way I found, without requiring admin rights:
make sure the directory you want to install your packages into exists. If you want to respect the convention, use: C:\Users\username\R\win-library\rversion (for example, something like: C:\Users\janebloggs\R\win-library\3.6)
create a .Renviron file in your home directory (which might be on the network drive?), and in it, write one single line that defines the R_LIBS_USER variable to be your custom path:
R_LIBS_USER=C:/Users/janebloggs/R/win-library/3.6
(feel free to add comments too, with lines starting with #)
If a .Renviron file exists, R will read it at startup and use the variables as they are defined in there, before running the code in the .Rprofile. You can read about it in help(Startup).
Now it should be persistent between sessions!
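A quick way to check the result after restarting R is sketched below. It only inspects the session; the assumption that the user-level .Renviron lives in the directory returned by path.expand("~") holds for a default setup but not if R_ENVIRON_USER points elsewhere.

```r
# Diagnostic sketch: confirm that R picked up the .Renviron setting.
home_env <- file.path(path.expand("~"), ".Renviron")  # default location of the user file

cat("User .Renviron found: ", file.exists(home_env), "\n")
cat("R_LIBS_USER is set to:", Sys.getenv("R_LIBS_USER"), "\n")
cat("First library in use: ", .libPaths()[1], "\n")
```

If R_LIBS_USER shows the new path but it is missing from .libPaths(), the directory most likely does not exist yet, since R only keeps library directories that exist at startup.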
After a couple of hours of trying to solve the issue in several ways, some of which are described here, for me (on Win 10) the option of creating a Renviron file worked, but a little different from what was written here above.
The task is to change the value of the variable R_LIBS_USER. To do this two steps needed:
Create the file named Renviron (without dot) in the folder \Program\etc\ (Program is the directory where R is installed--for example, for me it was C:\Program Files\R\R-4.0.0\etc)
Insert a line in Renviron with new path: R_LIBS_USER = "C:/R/Library"
After that, restart R and use .libPaths() to confirm the default directory changed.
I think I tried all of the above and it didn't work for me. This worked, though:
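In your home directory, make a file called ".Rprofile" (the line below is R code, so it belongs in .Rprofile; .Renviron only accepts NAME=value lines)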
In that file, write:
.libPaths(new = "/my/path/to/libs")
Save and restart R if you had it open

Dependency management in R

Does R have a dependency management tool to facilitate project-specific dependencies? I'm looking for something akin to Java's maven, Ruby's bundler, Python's virtualenv, Node's npm, etc.
I'm aware of the "Depends" clause in the DESCRIPTION file, as well as the R_LIBS facility, but these don't seem to work in concert to provide a solution to some very common workflows.
I'd essentially like to be able to check out a project and run a single command to build and test the project. The command should install any required packages into a project-specific library without affecting the global R installation. E.g.:
my_project/.Rlibs/*
Unfortunately, Depends: within the DESCRIPTION: file is all you get for the following reasons:
R itself is reasonably cross-platform, but that means we need this to work across platforms and OSs
Encoding Depends: beyond R packages requires encoding the Depends in a portable manner across operating systems---good luck encoding even something simple such as 'a PNG graphics library' in a way that can be resolved unambiguously across systems
Windows does not have a package manager
AFAIK OS X does not have a package manager that mixes what Apple ships and what other Open Source projects provide
Even among Linux distributions, you do not get consistency: just take RStudio as an example which comes in two packages (which all provide their dependencies!) for RedHat/Fedora and Debian/Ubuntu
This is a hard problem.
The packrat package is precisely meant to achieve the following:
install any required packages into a project-specific library without affecting the global R installation
It allows installing different versions of the same packages in different project-local package libraries.
I am adding this answer even though this question is 5 years old, because this solution apparently didn't exist yet at the time the question was asked (as far as I can tell, packrat first appeared on CRAN in 2014).
Update (November 2019)
The new R package renv replaced packrat.
As a stop-gap, I've written a new rbundler package. It installs project dependencies into a project-specific subdirectory (e.g. <PROJECT>/.Rbundle), allowing the user to avoid using global libraries.
rbundler on Github
rbundler on CRAN
We've been using rbundler at Opower for a few months now and have seen a huge improvement in developer workflow, testability, and maintainability of internal packages. Combined with our internal package repository, we have been able to stabilize development of a dozen or so packages for use in production applications.
A common workflow:
Check out a project from github
cd into the project directory
Fire up R
From the R console:
library(rbundler)
bundle('.')
All dependencies will be installed into ./.Rbundle, and an .Renviron file will be created with the following contents:
R_LIBS_USER='.Rbundle'
Any R operations run from within this project directory will adhere to the project-specific library and package dependencies. Note that, while this method uses the package DESCRIPTION to define dependencies, the project needn't have an actual package structure. Thus, rbundler becomes a general tool for managing an R project, whether it be a simple script or a full-blown package.
You could use the following workflow:
1) create a script file that contains everything you want to set up, and store it in your project directory as e.g. projectInit.R
2) source this script from your .Rprofile (or any other file executed by R at startup) with a try statement
try(source("./projectInit.R"), silent=TRUE)
This will guarantee that even when no projectInit.R is found, R starts without error message
3) if you start R in your project directory, the projectInit.R file will be sourced if present in the directory and you are ready to go
This is from a Linux perspective, but should work in the same way under windows and Mac as well.

Resources