R: instructions for unbundling and using a packrat snapshot

I used packrat (v 0.4.8-1) to create a snapshot and bundle of the R package dependencies that go along with the corresponding R code. I want to provide the R code and packrat bundle to others to make the work I am doing (including the R environment) fully reproducible.
I tested unbundling using a different computer from the one I used to write the R code and create the bundle. I opened an R code file in RStudio and called library(packrat) to load packrat (also v 0.4.8-1). I then called packrat::unbundle(bundle = "directory", where = "directory"), which unbundled successfully. But subsequently calling packrat::restore() gave me the error "This project has not yet been packified. Run 'packrat::init()' to init packrat". It seems like init() should not be necessary, because I am not trying to create a new snapshot, but rather to use the one in the bundle. The packrat page (https://rstudio.github.io/packrat/) and CRAN provide very little documentation about unbundling that would help me troubleshoot this, or that I could point users of my code to for instructions (they will likely be familiar with R, but may not have used packrat).
So, can someone please provide clear step-by-step instructions for how users of a bundled snapshot should unbundle, and then use that saved snapshot to run an R code file?

After some experimenting, I found an approach that seems to have worked so far.
I have provided users with three files:
- tar.gz (the packrat bundle file)
- unbundle.R (an R code file that loads the packrat library and calls the unbundle command for the tar.gz file; see the sketch below)
- unbundle_readme.txt
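A sketch of what unbundle.R might contain (the bundle file name is a placeholder; the real file name is whatever the tar.gz is called):
# unbundle.R -- run from the directory that contains the tar.gz bundle
library(packrat)
# Unbundle into the current directory; this recreates the project folder
# (including 'code_folder') and restores its packrat package library.
packrat::unbundle(bundle = "my_project_bundle.tar.gz", where = ".")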
The readme file includes instructions similar to those below, and so far users have been able to run the R code using the package dependencies. The readme file tells users about the requirements (R, RStudio, packrat, and the R package development prerequisites: Rtools for Windows, Xcode for Mac), and includes the output of sessionInfo() to document the R package versions that the R code should use after the instructions are followed. In the example below, 'code_folder' refers to a folder within the tar.gz file that contains the R code and associated input files.
Example unbundle instructions:
Step 1
Save, but do not expand/unzip, the tar file to a directory.
Problems with accessing the saved package dependencies
are more likely when a program other than R or RStudio
is used to unbundle the tar file.
If the tar file has already been expanded, re-save the
tar file to a new directory, which should not be the same
directory as the expanded tar file, or a subdirectory of
the expanded tar file.
Step 2
Save unbundle.R in the same directory as the tar file
Step 3
Open unbundle.R using RStudio
Step 4
Execute unbundle.R
(This will create a subfolder ‘code_folder’.
Please note that this step may take 5-15 minutes to run.)
Step 5
Close RStudio
Step 6
Navigate to the subfolder ‘code_folder’
Step 7
Open an R script using RStudio
(The package library should correspond to that listed below.
This will indicate RStudio is accessing the saved package
dependencies; see the optional check after Step 8.)
Step 8
Execute the R code, which will utilize the project package library.
After the package library has been loaded using the above
steps, it is not necessary to re-load the package library for each
script. RStudio will continue to access the package dependencies
for each script you open within the RStudio session. If you
subsequently close RStudio, and then open scripts from within
the unbundle directory, RStudio should still access the
dependencies without requiring re-loading of the saved package
snapshot.
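An optional check after the above steps (this assumes the bundle restored a packrat project library under code_folder):
# Run in the RStudio console after opening a script from 'code_folder':
.libPaths()      # the first entry should point into code_folder/packrat/lib/...
sessionInfo()    # the package versions should match those listed in this readme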

Related

R package vignettes

I am a little confused as to why there are multiple possible locations for "vignettes" in an R package. I don't understand which locations are used for what and when. For example:
devtools::use_vignette()
creates a vignettes folder under the root of the package
devtools::build_vignettes()
creates an inst/doc folder that gets promoted to the root at build
pkgdown::build_site()
creates a docs folder.
As background: I have read H. Wickham's R Packages book, I have created several packages using the first option, and everything has behaved well. I would have users install from GitHub using:
devtools::install_github(pkg,build_vignettes=TRUE)
Now, I have just started to contribute to the joint development of a package in which the first and third options have been used. I have noticed that the .Rmd file in the vignettes folder is the same as the index.html file in the docs folder. Does pkgdown copy from the vignettes folder?
Also, for this package, when I install from GitHub (with build_vignettes=TRUE) I get an error saying installation failed because the doc/index.html path couldn't be found. Why would that happen?
Vignettes development
There is only one place to put raw vignettes: the vignettes directory at the root of the package. This is where you write your Rmd files with text and code examples while developing your package.
Build vignettes for your users
When you build your vignettes, the Rmd file is knit. The resulting HTML file, the raw Rmd file, and the extracted R code are saved as three files in the inst/doc directory. This is what is kept in the installed package, and this is what users will be able to read.
{pkgdown}
{pkgdown} uses the Rmd files in your vignettes directory to knit HTML files so that it can build a website for your package. It also builds a reference page listing the functions, and an index page from the README file that is also used for your Git repository. The docs folder is not supposed to stay in the R package and is not accessible to users who install it; it exists to present your package on the Internet.
Conclusion
Hence, when you develop, you only write your Rmd vignettes in the vignettes directory. The other tools automatically copy what they need from there.
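A rough sketch of that workflow (the vignette name "my-vignette" is just an example):
usethis::use_vignette("my-vignette")   # creates vignettes/my-vignette.Rmd and sets up DESCRIPTION (knitr in Suggests, VignetteBuilder)
devtools::build_vignettes()            # knits the Rmd files so the rendered vignettes ship with the built package (inst/doc)
pkgdown::build_site()                  # knits the same Rmd files again into the docs/ website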

RStudio: using different package versions for each .Rproj

I have a few older R projects I'm working with, which depend on several currently deprecated (or heavily modified) packages. In order for everything to work smoothly I use older versions of those packages, which I have saved in another folder and copy manually into %userprofile%\documents\R\win-library\3.3 when necessary. However, this is not convenient, especially if I want to run multiple projects simultaneously, some of which require the new and updated versions of the packages.
My question - is there a way to specify custom directories for each .Rproj from which it would take and load the libraries?
You can solve this much more simply:
Have a top-level directory for each project; call them projA, projB, ...
Within each of these, create a directory libs/, say.
And within each of these directories have a file .Rprofile with a single call such as .libPaths("./libs")
Now when you start R in the different project directories, each will have a separate library directory at the front of the library path, allowing you to place per-project overrides there.
In a nutshell, the approach outlined here allows you to keep the local and modified packages around as you please. (You can even add common directories via .libPaths() if you so choose.)
The nice thing is that this:
works with any R invocation, batch or GUI or RStudio or Shiny or ...
does not depend on any other tools, and hence
does not rely on RStudio or .Rproj files -- though you are free to use RStudio as well.
As so often, Base R is there for you.
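A minimal sketch of this setup, using the projA/libs layout from above (directory names are just the answer's example; prepending with c() is one way to write it that keeps the default libraries visible as well):
# projA/.Rprofile
local({
  dir.create("libs", showWarnings = FALSE)   # per-project library directory
  .libPaths(c("./libs", .libPaths()))        # search ./libs before the default libraries
})
# Packages installed from within this project now land in projA/libs,
# so library() picks up the per-project versions first.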
One option is to use the checkpoint package by Revolution Analytics.
For each main R file in a project, you can indicate the date for which you wish to load a set of packages. You can read a bit more about it here.
To list the snapshot dates available on the MRAN mirror, use getValidSnapshots(mranRootUrl = mranUrl()).
To create a checkpoint:
# Create temporary project and set working directory
example_project <- paste0("~/checkpoint_example_project_", Sys.Date())
dir.create(example_project, recursive = TRUE)
oldwd <- setwd(example_project)
# Write dummy code file to project
cat("library(MASS)", "library(foreach)",
sep="\n",
file="checkpoint_example_code.R")
# Create a checkpoint by specifying a snapshot date
library(checkpoint)
checkpoint("2014-09-17")
# Check that CRAN mirror is set to MRAN snapshot
getOption("repos")
# Check that library path is set to ~/.checkpoint
.libPaths()
# Check which packages are installed in checkpoint library
installed.packages()
# cleanup
unlink(example_project, recursive = TRUE)
setwd(oldwd)

R package development - old version of function used in project

I am developing a package locally with devtools in RStudio. After modifying a function, when I try to call it from a project, R keeps using the old version of the function.
My workflow is to:
Modify the function and save
Call Build & Reload
Test the function with some example code in the package development
project (I often run another Build & Reload after that)
Go to the project I want to use the function in
call library(my_library)
But the modification I just made does not take effect. What is wrong with this workflow?
?devtools::build:
Building converts a package source directory into a single bundled file. If binary = FALSE this creates a tar.gz package that can be installed on any platform, provided they have a full development environment (although packages without source code can typically be installed out of the box). If binary = TRUE, the package will have a platform specific extension (e.g. .zip for windows), and will only be installable on the current platform, but no development environment is needed.
My reading of this is that you still need to devtools::install() your package. Building just creates the binary, it doesn't install the new version.
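A minimal sketch of the adjusted workflow (the path and the package name my_library are placeholders):
# In the package source project, after modifying the function:
devtools::document("~/path/to/my_library")   # regenerate NAMESPACE / man pages if you use roxygen2
devtools::install("~/path/to/my_library")    # build and install the updated version into your library
# In the project that uses the package: restart R, then
library(my_library)                          # now loads the updated function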

How to install stringi from local file (ABSOLUTELY no Internet Access)

I am working on a remote server using RStudio. This server has no access to the Internet. I would like to install the package "stringi." I have looked at this stackoverflow article, but whenever I use the command
install.packages("stringi_0.5-5.tar.gz",
configure.vars="ICUDT_DIR=/my/directory/for/icudt.zip")
It simply tries to access the Internet, which it cannot do. Up until now I have been using Tools -> Install Packages -> Install from Packaged Archive File. However, due to this error, I can no longer use this method.
How can I install this package?
If you have no internet access on local machines, you can build a distributable source package that includes all the required ICU data files (for offline use) by removing the relevant lines from the .Rbuildignore file. The following command sequence should do the trick:
wget https://github.com/gagolews/stringi/archive/master.zip -O stringi.zip
unzip stringi.zip
sed -i '/\/icu..\/data/d' stringi-master/.Rbuildignore
R CMD build stringi-master
Assuming the most recent development version is 1.3.1,
a file named stringi_1.3.1.tar.gz is created in the current working directory.
The package can now be installed (the source bundle may be propagated via
scp etc.) by executing:
R CMD INSTALL stringi_1.3.1.tar.gz
or by calling install.packages("stringi_1.3.1.tar.gz", repos=NULL),
from within an R session.
For a Linux machine, the easiest way from my point of view is:
Download the release you need from Rexamine as a tar.gz file to your local PC. Unlike the version on CRAN, it already contains the icu55/data/ folder.
Move the archive to your target Linux machine without internet access.
Run R CMD INSTALL stringi-1.0-1.tar.gz (for release 1.0-1).
You provided the wrong value for configure.vars: it has to point to the directory that contains icudt.zip, not to the file itself.
Correct your code to the following:
install.packages("stringi_0.5-5.tar.gz",
configure.vars="ICUDT_DIR=/my/directory/for/")
Follow the steps below:
Download icudt55l.zip separately, from a server where you have internet access, with
wget http://www.mini.pw.edu.pl/~gagolews/stringi/icudt55l.zip
Copy the downloaded files to the server where you want to install stringi.
Execute the following command, where icudt55l.zip has been copied to /tmp/ALL:
R CMD INSTALL --configure-vars='ICUDT_DIR=/tmp/ALL' stringi_1.1.6.tar.gz
The suggestion from @gagolews almost worked for me. Here's what actually did the trick with RStudio.
Download the master.zip file, which will save as stringi-master.zip.
Unzip the file onto your desktop. The unzipped folder should be stringi-master.
Edit the .Rbuildignore file by removing ^src/icu55/data and ^src/icu61/data or similar lines.
Move the folder from your desktop to the home directory of your server.
Create a New Project in RStudio with ~/stringi-master as the Existing Directory.
From RStudio's menu, select Build and Build Source Package. (You may need to first select Configure Build Tools. For Project build tools choose Package then select OK.)
It should create a tar.gz file, in the following format: stringi_x.x.(x+1).tar.gz. For example, if the current version of stringi is 1.5.3, it will create version 1.5.4. (I received a few warnings that didn't seem to affect the outcome.)
Move the newly created package to your local repository, update the repository index, and install the package (a sketch of this last step follows below).
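A minimal sketch of that last step, assuming a local CRAN-like repository at /srv/local-cran and a freshly built stringi_1.5.4.tar.gz (both names are just examples):
# Copy the built source package into the repository's src/contrib directory
file.copy("stringi_1.5.4.tar.gz", "/srv/local-cran/src/contrib")
# Regenerate the PACKAGES index so install.packages() can find the new version
tools::write_PACKAGES("/srv/local-cran/src/contrib", type = "source")
# Install from the local repository via a file:// URL (no internet access needed)
install.packages("stringi", repos = "file:///srv/local-cran")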

How to build a package in an existing directory with RStudio

I am working on a making a package in RStudio. I already have it as a project, and I am using Git for version control. My directory structure is currently
--Project
  --R
  --.git
With the R code in the R directory.
My problem occurs when I go to build the package. I want to call the package "Project", and I get a directory structure like this:
--Project
  --R
  --.git
  --Project
    --man
    --R
And in this setup there are two sets of R files, one in each "R" directory. The less nested one is being versioned by Git; the more nested one is the source for the package. Is there some way to get the package to just use the preexisting "R" directory? Can I just rename my current "Project" directory to something else, make the package as "Project", and then copy the .git directory to the new Project directory?
In RStudio (I'm using v. 1.0.136), this can be done via the menu options Tools > Project Options > Build Tools. Select Package from the drop-down list for Project build tools, and if you know what you're doing, fill in any of the other fields as you deem appropriate.
This operation will not necessarily give you the structure above, i.e. the man and R folders or a DESCRIPTION file; the DESCRIPTION file can be created with the use_description() function from the usethis package.
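A minimal sketch of adding the missing package infrastructure to the existing project directory (run from the Project/ directory; these usethis/devtools calls are one way to do it, not part of the answer above):
# Run from the existing Project/ directory
usethis::use_description()   # creates a DESCRIPTION file, making the directory a valid package
usethis::use_namespace()     # creates a basic NAMESPACE file
# The preexisting R/ directory is then used as the package's R code;
# devtools::document() and devtools::install() can be run from here as usual.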

Resources