Installing earlier version of R package not on CRAN - r

I am developing an R package that will not be hosted on GitHub or submitted to CRAN. I am using git for version control. I would like to give my users the ability to load older versions of the package. I've read here about usethis::use_version() for versioning my package. This will track the versions using git, but I'm wondering if there's a straightforward way for my end users to load an older version without having to use git themselves. For packages hosted on CRAN, I know the versions package can be used to achieve this.
Right now my best solution is to create a copy of the R package in a new directory when starting work on a new version. Then the end users can load the version they want by choosing the appropriate directory. If there is a better solutions than this, I would be interested to hear it.

The remotes::install_git() function has a ref= parameter when you type a commit or tag name. If you tag your releases, then you can install which ever version you want with the correct tag. Your users don't need to run git themselves, but they will need access to the git repo to pull the correct version.
If you want to host your own repository for your users, you can also look into something like miniCRAN or drat. Since those basically a CRAN-like repository for your packages, you can probably use existing tools like the versions package to interact with the repo (assuming you keep older versions around in the same way CRAN does).

Related

Don't use R system library

I'm trying to use a linux server with R installed. Apparently the R system library has old versions of non-base packages installed like dplyr and testthat.
Because i don't have permission to edit the system library, i'm unable to update the packages.
My plan is to only use a user library, so I can controll the package versions myself. However i'm unable to remove the "/usr/lib64/R/library" folder from .libPaths(). I tried changing the environment variables R_LIBS_SITE and R_LIBS with the .Renviron and .Rprofile files to a different folder, but the /usr/lib64/R/library folder will always be present. Removing it with the command .libPaths(.libPaths()[1:2]) doesn't work either.
Is there a way to remove the system library from .libPaths(), so I'm not depending on the update policy of the server admin?
You can't remove the system library, because that's where the base packages live. They can't be installed anywhere else, and R won't work without them.
Best would be for you to get your sysadmin to update the system library. Those obsolete packages probably contain bugs.
If you can't do that, then run update.packages(instlib = "local") to install all the latest versions in the library named "local". (Substitute your own local lib name, of course.) This requires all your users to specify .libPaths("local") when they start, and some will likely forget, so it's not as good.
It might be easiest for you to just install a full copy of R in your own account. Then you'll have control of things, and anyone using your copy will get your library.
(There's a new release (3.5.3) coming in ten days; you might wait for that, or install one of the betas or RCs, which should be available now, then update again when the final release arrives.)
For me, it works to use
.libPaths(.libPaths()[2:1])
This will still search the system library, but only after it searches my personal library, so if I have a newer version, it uses that. Note: I used .libPaths()[2:1] not .libPaths()[1:2]

Automatically update package

I have written an R package, hosted on Github, that I would like users to always run the most up-to-date version. What is the best practice for checking within R, preferably at package loading, if there is a newer version? And if there is a newer version to download and install it.
I know I can get the current installed version using packageVersion('MyPackage').
I guess one way to get the repository version number would be to download the DESCRIPTION file, and use some regex to search for the version number. But is there a better way?
Having established that an update is available, is there then a safe way to install it from within the same package at loading. For example having something like
if (github_ver > installed_ver) {
install_github('MyRepository/MyPackage')
}
embedded in the .onLoad() hook? Although this way looks risky to me.

Pinning R package versions

How do you best pin package versions in R?
Rejected strategy 1: Pin to CRAN source tar.gzs
Doesn't work if you want to pin it at the latest version since CRAN does not put the tip version in the archive (duh)
Rejected strategy 2: Use devtools
Don't want to, because it takes ages to compile and adds lots of stuff I don't want to use
Rejected strategy 3: Vendor
Would rather avoid having to copy all source
To provide a little bit more information on packrat, which I use for this purpose. From the website.
R package dependencies can be frustrating. Have you ever had to use
trial-and-error to figure out what R packages you need to install to
make someone else’s code work–and then been left with those packages
globally installed forever, because now you’re not sure whether you
need them? Have you ever updated a package to get code in one of your
projects to work, only to find that the updated package makes code in
another project stop working?
We built packrat to solve these problems. Use packrat to make your R
projects more:
Isolated: Installing a new or updated package for one project won’t
break your other projects, and vice versa. That’s because packrat
gives each project its own private package library. Portable: Easily
transport your projects from one computer to another, even across
different platforms. Packrat makes it easy to install the packages
your project depends on. Reproducible: Packrat records the exact
package versions you depend on, and ensures those exact versions are
the ones that get installed wherever you go.
Packrat stores the version of the packages you use in the packrat.lock file, and then downloads that version from CRAN whenever you packrat::restore(). It is much lighter weight than devtools, but can still take some time to re-download all of the packages (depending on the packages you are using).
If you prefer to store all of the sources in a zip file, you can use packrat::snapshot() to pull down the sources / update the packrat.lock and then packrat::bundle() to "bundle" everything up. The aim for this is to make projects / research reproducible and portable over time by storing the package versions and dependencies used on the original design (along with the source code so that the OS dependency on a binary is avoided).
There is much more information on the website I linked to, and you can see current activity on the git repo. I have encountered a few cases that work in a less-than-ideal way (packages not on CRAN have some issues at times), but the git repo still seems to be pretty active with issues/patches which is encouraging.

What's a good strategy for saving old R package versions on GitHub?

The development of RStudio and the packages devtools and roxygen2 has made R package creation pretty easy. I use GitHub for version control and devtools allows others to easily install directly from my account.
As my package gradually changes with each version, I'm wondering if I should be maintaining .zip files (or other format) of my past stable builds, in case anyone would ever want to use a previous version.
It's easy to download a .zip of an R package directly from GitHub, but I'm wondering if I should add this to the same GitHub directory (e.g. https://github.com/myaccount/mypackage/previous_versions/mypackage_0.1.zip) without messing up somebody's installation via install_github("myaccount/mypackage").
So, the main Qs are:
Should I keep an old package version at all?
Should I keep old package versions in a sub-folder of my GitHub R package directory?
Should I save .zip files downloaded from GitHub as my old version, or produce a Source or Binary file during the package build itself (i.e. in RStudio)?
Is this a superfluous activity if one isn't yet willing to publish to CRAN?!
When you think your package is at a good solid place, you should tag a release. This archives the branch at that point in time and stores the zip file with the source code, and the tar.gz file.
I tend to mark my CRAN packages as a release each time I release it to CRAN (for example, see https://github.com/nutterb/pixiedust/releases) and with some intemediary tags that I consider noteworthy.
Another good strategy for managing changes in between tagged releases is to maintain a development branch below your main branch. That way your development changes won't pollute or break anything being used by those pulling from your main branch. It makes you free to experiment in the dev branch while always having a clean, working copy to push to and restore from.
1. Should I keep an old package version at all?
It's subjective, but I'd definitely say "yes" unless there's a space constraint, which is probably unlikely.
This serves 2 purposes. One is for your own convenience, such as if you want to make sure that you always have a quick way to test the results of older versions versus a newer version.
The other is that people often need older versions of packages, such as if someone wants to use your package but they're using an older version of R on a server where the policies prevent an update to R. Perhaps a newer version of your package includes a new dependency which only works with a package that depends on a certain version of R or higher.
Of course, packages can always be installed without the compressed or binary files, but it's a nice convenience.
2. Should I keep old package versions in a sub-folder of my GitHub R package directory?
I would put it in a trunk or special subfolder that won't be automatically downloaded when someone tries to install_github or clone your master branch. Having a separate branch is a good idea.
3. Should I save .zip files downloaded from GitHub as my old version, or produce a Source or Binary file during the package build itself (i.e. in RStudio)?
As the package author you're in a position to know if these differ significantly and which if either is better, but by default I'd recommend the RStudio build because I assume (if you're like me) that you're less likely to include unnecessary files this way.
4. Is this a superfluous activity if one isn't yet willing to publish to CRAN?!
No, not necessarily. If people rely on your package then it really doesn't matter if it's on CRAN or not. In fact, not being on CRAN may be a reason to be more proactive like this to ensure that your users will always have access to the needed version of your package.

R - Add custom packages to cloned version of CRAN

At my company we have a server where we host, for internal use only, a clone of CRAN (refreshed only when new versions of R come out). We do this to allow internal servers to install packages from R without needing internet access and it helps ensure that everyone at the company is always using the same version of packages (or can easily update to get caught up).
Recently we've been making some custom internal packages. This tend to be convenience wrapper packages built explicitly around our systems, they would be of zero use to anyone outside our company so I don't want to try and submit them to the official CRAN.
How do I 'submit' them to our cloned CRAN so they can be installed via install.pacakges('blahblah') instead of me having to email out copies of the packages and upload them to each server?
You want drat to inject packages into a repo -- any repo -- and drat does not care if your repo is a 0% or 100% clone of CRAN, or anywhere in between.
A repo is still just a repo: a collection of source tarballs [and maybe binary packages if you have the (mis)fortune to rely on Windows too] and you simply need to update the PACKAGES files.
We run a local repo at work with our very much non-public packages for the same reason.

Resources