I am developing R packages for internal-use applications at work. Unfortunately, not everybody is on the same version of R. I want to build Windows binaries of my package to support multiple R versions, for example 3.6.x and 4.0.x.
I can easily do this by building the package (I use devtools::build(binary = TRUE)), then changing the R version in RStudio, restarting, and running the build again. But this gets very tedious.
Is there a way to streamline this (e.g., my own custom function to build both at once)? I imagine some CI/CD setup is probably best, but the solution would have to be limited to what I can run locally.
Don't do it in RStudio; write a script to do it. It would have commands like these:
/path/to/R3.6.3/R CMD INSTALL --build /path/to/yourpackage
mv yourpackage.*.zip /path/for/R3.6users
/path/to/R4.0.3/R CMD INSTALL --build /path/to/yourpackage
mv yourpackage.*.zip /path/for/R4.0users
You don't need a lot of builds; only the first two parts of the version number (e.g. 3.6 or 4.0) need to match the target system.
You could implement this script in R using system() calls, but it's probably simpler to do it using one of the Windows script languages (.bat or .cmd or whatever).
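If you stay in R, a minimal sketch of such a script might look like this (the installation paths, package path, and output directories are assumptions to adjust to your setup; it also assumes the built .zip lands in the current working directory):

# One entry per target R version: path to that R binary and the folder for its users (assumed)
builds <- list(
  list(r = "C:/Program Files/R/R-3.6.3/bin/R.exe", out = "Z:/packages/R3.6"),
  list(r = "C:/Program Files/R/R-4.0.3/bin/R.exe", out = "Z:/packages/R4.0")
)
for (b in builds) {
  # Build a binary package with this specific R version
  system2(b$r, c("CMD", "INSTALL", "--build", "C:/path/to/yourpackage"))
  # Move the resulting .zip into the folder for that R version
  zips <- Sys.glob("yourpackage_*.zip")
  file.copy(zips, b$out, overwrite = TRUE)
  file.remove(zips)
}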
I would like to use the R environment I installed with conda inside Visual Studio Code (on macOS). First I installed R with conda.
But how do I use/activate the environment in Visual Studio Code? In the settings I can't find the equivalent of "Python: Select Interpreter" or "python.venvPath".
Thanks!
R support in VSCode is handled by third-party extensions. The most popular one is R by Yuki Ueda, and there is also R Tools by Mikhail Arkhipov.
For both of these, you can change the R interpreter to use in the settings.
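For example, with the R extension on macOS the interpreter path is controlled by settings such as r.rpath.mac and r.rterm.mac; a sketch of the settings.json entries, with a hypothetical conda path, would be:

{
  "r.rpath.mac": "/Users/you/miniconda3/envs/r-env/bin/R",
  "r.rterm.mac": "/Users/you/miniconda3/envs/r-env/bin/R"
}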
However, there is no built-in support for Anaconda, mostly because it isn't that popular or necessary in the R community. Most people use the standard R installation instead, and most help resources are written for that type of installation: https://cloud.r-project.org/bin/macosx/
It has been 2 years since this entry and the extension still doesn't support conda environments.
For my configuration (I have R installed in a conda environment), I found a pretty painless workaround:
open VSCode
install the extension and configure it as suggested, using the conda paths for both R and, if you have it installed, radian
close VSCode
open a terminal
activate your conda environment
start VSCode from your terminal using the code command
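For example, assuming your environment is named r-env (a hypothetical name):

conda activate r-env
code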
After this, everything seems to be up and running correctly. You can start an R terminal using the command palette and, as you run your code, you should be able to see all the information about the environment and namespaces as well as your plots.
R CMD check automatically runs tests located in the tests/ directory. However, running the tests this way requires building the package first. After that, R CMD check goes through various sanity checks before finally reaching the tests at the end.
Question: Is there a way to run those tests without having to build or install the package first?
NOTE: without using testthat or other non-standard packages.
To summarise our discussion.
To my knowledge, there is no standard alternative to R CMD check for unit testing provided by base R.
Typically for unit testing, I source everything under R/ (and dyn.load everything under src/) and then source everything under tests/. (Actually, I also use the Example sections of the help pages in the man/ directory as test cases and compare their outcome to those from previous package versions.)
I assume that these are the basic testing functionalities provided by devtools and testthat. If you expect to develop multiple packages and want to stay independent of non-base R, I'd recommend automating the above processes with custom scripts/packages.
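A minimal sketch of that approach (assuming the standard package layout and that it is run from the package root):

# Load the package code without installing it: source everything under R/
for (f in list.files("R", pattern = "\\.[Rr]$", full.names = TRUE)) source(f)
# Then run every test script under tests/
for (f in list.files("tests", pattern = "\\.[Rr]$", full.names = TRUE)) {
  cat("Running", f, "\n")
  source(f)  # any error here stops the run and points at the failing test
}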
I'd recommend looking into http://r-pkgs.had.co.nz/tests.html.
I can only find information on how to install a ready-made R extension package, but nowhere is it mentioned which commands a developer of an extension package has to use during daily development. I am using Rcpp and I am on Windows.
If this were a typical C++ project, it would go like this:
edit
make # oops, typo
edit # fix typo
make # oops, forgot an #include
edit
make # good; updates header dependencies for subsequent 'make' automatically
./fooreader # test it
make install # only now I'm ready
Which commands do I need for daily development of an Rcpp package project?
I've created a skeleton project using these commands from the R command line:
library(Rcpp)
Rcpp.package.skeleton("FooReader", example_code = FALSE,
                      author = "My Name", email = "my.email@example.com")
This created 3 files:
DESCRIPTION
NAMESPACE
man/FooReader-package.Rd
Now I dropped source code into
src/readfoo.cpp
with these contents:
#include <Rcpp.h>
#error here
I know I can run this from the R command line:
Rcpp::sourceCpp("D:/Projects/FooReader/src/readfoo.cpp")
(this does run the compiler and indicates the #error).
But I want to develop a package ultimately.
There is no universal answer for everybody, I guess.
For some people, RStudio is everything, and with some reason. One can use the package creation facility to create an Rcpp package, then edit and just hit the buttons (or keyboard shortcuts) to compile and re-load and test.
I also work a lot in a shell, so I do a fair amount of editing in Emacs/ESS along with R CMD INSTALL (where, thanks to ccache, recompilation of unchanged code is immediate), plus command-line use via r from the littler package -- this allows me to write compact expressions that load the new package and evaluate something, e.g. r -lnewpackage -e'someFunc(somearg)' to test newpackage::someFunc() with somearg.
You can also launch the build and test from Emacs. As I said, it all depends.
Both those answers are for packages, where I do real work. When I just test something in a single file, I do that in one Emacs buffer and sourceCpp() in an R session in another buffer of the same Emacs. Or sometimes I edit in Emacs and run sourceCpp() in RStudio.
There is no one answer. Find what works for you.
Also, the first part of your question describes the initial setup of a package. That is not part of the edit/compile/link/test cycle, as it is a one-off. And for that, too, we have different approaches, many of which have been discussed here.
Edit: The other main misunderstanding in your question is that once you have a package, you generally do not use sourceCpp() anymore.
In order to test an R package, it has to be installed into a (temporary) library such that it can be attached to a running R process. So you will typically need:
R CMD build . to build package_version.tar.gz
R CMD check <package_version.tar.gz> to test your package, including tests placed in the tests folder
R CMD INSTALL <package_version.tar.gz> to install it into a library
After that you can attach the package and test it. Quite often I try to use a more TDD approach, which means I do not have to INSTALL the package. Running the unit tests (e.g. via R CMD check) is enough.
All that is independent of Rcpp. For a package using Rcpp you need to call Rcpp::compileAttributes() before these steps, e.g. with Rscript -e 'Rcpp::compileAttributes()'.
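Putting it together for the package from the question, a typical cycle might look like this (the version number in the tarball name is an assumption; use whatever your DESCRIPTION says):

Rscript -e 'Rcpp::compileAttributes()'
R CMD build .
R CMD check FooReader_1.0.tar.gz
R CMD INSTALL FooReader_1.0.tar.gz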
If you use RStudio for package development, it offers a lot of automation via the devtools package. I still find it useful to know what goes on under the hood, and RStudio is by no means required.
I want to run an R command from command line (actually, from within a Makefile). The command is roxygen2::roxygenise(), if it is relevant. I don't want to create a new file and run that as a script - that will just clutter my directory.
In Python, this is simple: you write python -c "import antigravity".
I use the Makefile to build, install and test a (Rcpp) package I'm working on.
This is generally done with so-called 'shebang scripts'.
Historically, littler was there first, about a decade or so ago. It is still widely used and contains a number of helper scripts, such as roxy.r, which does just what you desire: run roxygen2::roxygenize(). I use this all the time.
Next, Rscript started to ship with R. It is similar to littler, but automatically available wherever R is, which is a plus. On the minus side, it starts more slowly and fails to load the methods package, which is a source of a number of bug reports and SO questions.
Much more recently, R itself added the ability to run expressions following the -e ... switch.
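For the Makefile use case from the question, any of these works; for example (a minimal sketch, with an arbitrary target name, and remember the recipe line must start with a tab):

document:
	Rscript -e 'roxygen2::roxygenise()'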
So you have plenty of choices. You can also study plenty of src/Makevars files, many of which use Rscript.
I've found several posts about best practice, reproducibility and workflow in R, for example:
How to increase longer term reproducibility of research (particularly using R and Sweave)
Complete substantive examples of reproducible research using R
One of the major preoccupations is ensuring portability of code, in the sense that moving it to a new machine (possibly running a different OS) is relatively straightforward and gives the same results.
Coming from a Python background, I'm used to the concept of a virtual environment. When coupled with a simple list of required packages, this goes some way to ensuring that the installed packages and libraries are available on any machine without too much fuss. Sure, it's no guarantee - different OSes have their own foibles and peculiarities - but it gets you 95% of the way there.
Does such a thing exist within R, even if it's not as sophisticated? For example, simply maintaining a plain text list of required packages and a script that will install any that are missing?
I'm about to start using R in earnest for the first time, probably in conjunction with Sweave, and would ideally like to start in the best way possible! Thanks for your thoughts.
I'm going to use the comment posted by @cboettig in order to resolve this question.
Packrat
Packrat is a dependency management system for R. It gives you three important advantages (all of them focused on your portability needs):
Isolated: Installing a new or updated package for one project won't break your other projects, and vice versa. That's because packrat gives each project its own private package library.
Portable: Easily transport your projects from one computer to another, even across different platforms. Packrat makes it easy to install the packages your project depends on.
Reproducible: Packrat records the exact package versions you depend on, and ensures those exact versions are the ones that get installed wherever you go.
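A minimal sketch of the basic workflow (run from your project directory):

install.packages("packrat")
packrat::init()      # create a private library for this project
packrat::snapshot()  # record the exact package versions in use
packrat::restore()   # reinstall those exact versions on another machine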
What's next?
Walkthrough guide: http://rstudio.github.io/packrat/walkthrough.html
Most common commands: http://rstudio.github.io/packrat/commands.html
Using Packrat with RStudio: http://rstudio.github.io/packrat/rstudio.html
Limitations and caveats: http://rstudio.github.io/packrat/limitations.html
Update: Packrat has been soft-deprecated and is now superseded by renv, so you might want to check out that package instead.
The Anaconda package manager conda supports creating R environments.
conda create -n r-environment r-essentials r-base
conda activate r-environment
I have had a great experience using conda to maintain different Python installations, both user-specific and several versions for the same user. I have tested R with conda and the Jupyter notebook, and it works great. At least for my needs, which include RNA-sequencing analyses using DESeq2 and related packages, as well as data.table and dplyr. There are many Bioconductor packages available in conda via bioconda, and according to the comments on this SO question, it seems like install.packages() might work as well.
It looks like there is another option from RStudio devs, renv. It's available on CRAN and supersedes Packrat.
In short, you use renv::init() to initialize your project library, and use renv::snapshot() / renv::restore() to save and load the state of your library.
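A minimal sketch of that workflow:

install.packages("renv")
renv::init()      # set up a project-local library and create renv.lock
# ...install or update packages as usual...
renv::snapshot()  # record the current state of the library in renv.lock
renv::restore()   # recreate that state elsewhere from renv.lock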
I prefer this option to conda R environments because here everything is stored in the renv.lock file, which can be committed to a Git repo and distributed to the team.
To add to this:
Note:
1. Have Anaconda installed already.
2. It is assumed your working directory is "C:\".
To create the desired environment -> "r_environment_name":
C:\>conda create -n "r_environment_name" r-essentials r-base
To see the available environments:
C:\>conda info --envs
.
..
...
To activate the environment:
C:\>conda activate "r_environment_name"
(r_environment_name) C:\>
Launch Jupyter Notebook and let the party begin:
(r_environment_name) C:\> jupyter notebook
For a similar "requirements.txt", perhaps this link will help -> Is there something like requirements.txt for R?
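As a minimal sketch of that idea (assuming a plain text file r-requirements.txt with one package name per line; the file name and format are hypothetical):

pkgs <- readLines("r-requirements.txt")
missing <- setdiff(pkgs, rownames(installed.packages()))
if (length(missing) > 0) install.packages(missing)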
Check out roveR, the R container management solution. For details, see https://www.slideshare.net/DavidKunFF/ownr-technical-introduction, in particular slide 12.
To install roveR, execute the following command in R:
install.packages("rover", repos = c("https://lair.functionalfinances.com/repos/shared", "https://lair.functionalfinances.com/repos/cran"))
To make full use of the power of roveR (including installing specific versions of packages for reproducibility), you will need access to a laiR. For CRAN, you can use our laiR instance at https://lair.ownr.io; for uploading your own packages and sharing them with your organization, you will need a laiR license. You can contact us at the email address in the presentation linked above.