Run unit tests with testthat without package - r

I have a Shiny application which uses about 4 functions. I would like to test these functions, but it's not a package. How am I supposed to structure my code, and how do I execute these tests without devtools?

You can execute tests with testthat::test_dir() or testthat::test_file(). Neither relies on the code being in a package or on devtools; they just need the testthat package.
There are few requirements on how to structure your code.
If it were me, I would create a tests directory and add my test scripts under there, which would look something like:
|- my_shiny_app
|  |- app.R
|  |- tests
|  |  |- test_foo.R
|  |  |- test_bar.R
Then you can run your tests with test_dir('tests'), assuming you're in the my_shiny_app directory.
Your test scripts will have the same structure they have for packages, but you'd replace the library() call with a source() call referencing the file where your functions are defined.
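To make this concrete, here is a runnable sketch that builds that layout in a temporary directory and runs the tests with test_dir(). foo(), foo.R, and test_foo.R are placeholder names; in a real app the function file would already exist:

```r
library(testthat)

# Build the example layout in a temp dir (foo() and the file names are placeholders)
app <- file.path(tempdir(), "my_shiny_app")
dir.create(file.path(app, "tests"), recursive = TRUE, showWarnings = FALSE)

# A function your app might define ...
writeLines("foo <- function(x) x + 1", file.path(app, "foo.R"))

# ... and a test script that source()s it instead of calling library();
# testthat runs each test file with tests/ as the working directory
writeLines(c(
  'source(file.path("..", "foo.R"))',
  'test_that("foo() adds one", expect_equal(foo(1), 2))'
), file.path(app, "tests", "test_foo.R"))

results <- test_dir(file.path(app, "tests"), reporter = "silent")
```

test_dir() picks up every file in the directory whose name starts with "test", so test_foo.R and test_bar.R would both run.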

If you have a few functions and no package structure, another option is to write standalone test files by hand (with some simple if/stop error-catching) that you run with Rscript test_file1.R.
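A minimal hand-rolled test file in that spirit could look like the following sketch. foo() is a placeholder defined inline to keep the example self-contained; in a real app you would source() the file that defines your functions instead:

```r
# test_file1.R -- run from the shell with: Rscript test_file1.R
foo <- function(x) x + 1  # placeholder; normally: source("my_functions.R")

# A tiny check helper: stop() on failure, so Rscript exits with a non-zero status
check <- function(label, ok) {
  if (!isTRUE(ok)) stop("FAILED: ", label)
  cat("ok:", label, "\n")
}

check("foo() adds one", foo(1) == 2)
check("foo() is vectorised", all(foo(1:3) == 2:4))
```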
If you start using the package format instead (which would be advisable for further 'safe' development) and you still do not want to use testthat, I advise you to follow this blog post: here

Related

R: renv within R notebook-scoped (Rmd) workflows

I am looking for a way to make my R notebook-centric workflow more reproducible and subsequently more easily containerized with Docker. For my medium-sized data analysis projects, I work with a very simple structure: a folder with an associated .Rproj and an index.html (a landing page for GitHub Pages) that holds other folders containing the notebooks, data, scripts, etc. This simple "1 GitHub repo = 1 Rproj" structure was also good for my nb.html files rendered by GitHub Pages.
.
└── notebooks_project
    ├── notebook_1
    │   ├── notebook_1.Rmd
    │   └── ...
    ├── notebook_2
    │   ├── notebook_2.Rmd
    │   └── ...
    ├── notebooks_project.Rproj
    ├── README.md
    ├── index.html
    └── .gitignore
I wish to keep this workflow that utilizes R notebooks both as literate programming tools and control documents (see RMarkdown Driven Development), as it seems decently suited for medium reproducible analytic projects. Unfortunately, there is a lack of documentation about Rmd-centric workflows using renv, although it seems to be well integrated with it.
First, Yihui Xie hinted here that methods for using renv with individual Rmd documents include renv::activate(), renv::use(), and renv::embed(). renv::activate() does only part of what renv::init() does: it loads the project and sources init.R. From my understanding, it does this if a project was already initialized, but it acts like renv::init() if the project was not initialized: it discovers dependencies, copies them to the renv global package cache, and writes several files (.Rprofile, renv/activate.R, renv/.gitignore, .Rbuildignore). renv::use() works well within standalone R scripts where the script's dependencies are specified directly within that script and we need those packages automatically installed and loaded when the script is run. renv::embed() just embeds a compact representation of renv.lock into a code chunk of the notebook - it changes the .Rmd on render/save by adding a code chunk with the dependencies and deletes the call to renv::embed(). As I understand it, using renv::embed() and renv::use() could be sufficient for a reproducible stand-alone notebook. Nevertheless, I don't mind having the lock file in the directory or keeping the renv library, as long as they are all in the same directory.
Second, to prepare for subsequent Binder or Docker requirements, I want to use renv together with RStudio Package Manager. Grant McDermott provides some useful code here (that may go in the .Rprofile or in the .Rmd itself, I think) and provides the rationale for it:
The lockfile is referenced against RSPM as the default package
repository (i.e. where to download packages from), rather than one of
the usual CRAN mirrors. Among other things, this enables
time-travelling across different package versions and fast
installation of pre-compiled R package binaries on Linux.
Third, I'd like to use the here package to work with relative paths. It seems the way to go so that the notebooks can run when transferred or when running inside a Docker container. Unfortunately, here::here() looks for the .Rproj and will find it in my upper-level folder (i.e. notebooks_project). A .here file that may be placed with here::set_here() overrides this behavior, making here::here() point to the notebook folder as intended (i.e. notebook_1). Unfortunately, the .here file takes effect only on restarting the R session or running unloadNamespace("here") (documented here).
Here is what I have experimented with until now:
---
title: "<br> R Notebook Template"
subtitle: "RMarkdown Report"
author: "<br> Claudiu Papasteri"
date: "`r format(Sys.time(), '%d %m %Y')`"
output:
  html_notebook:
    code_folding: hide
    toc: true
    toc_depth: 2
    number_sections: true
    theme: spacelab
    highlight: tango
    font-family: Arial
---
```{r setup, include = FALSE}
# Activate renv for the current project
renv::activate()

# Set the default package source by operating system, so that we automatically
# pull in pre-built binary snapshots rather than building from source.
# This can also be appended to .Rprofile
if (Sys.info()[["sysname"]] %in% c("Linux", "Windows")) {
  # For Linux and Windows, use RStudio Package Manager (RSPM)
  options(repos = c(RSPM = "https://packagemanager.rstudio.com/all/latest"))
} else {
  # For Mac users, default to installing from CRAN/MRAN instead,
  # since RSPM does not yet support Mac binaries.
  options(repos = c(CRAN = "https://cran.rstudio.com/"))
  # options(renv.config.mran.enabled = TRUE)  ## TRUE by default
}
options(renv.config.repos.override = getOption("repos"))

# Install (if necessary) & load packages
packages <- c(
  "tidyverse", "here"
)
renv::install(packages, prompt = FALSE)  # install packages that are not in the cache
renv::hydrate(update = FALSE)  # install any packages used in the notebook but not provided; do not update
renv::snapshot(prompt = FALSE)

# Point here to the notebook directory
here::set_here()
unloadNamespace("here")  # need a new R session or unloadNamespace() for the .here file to take precedence over .Rproj
rrRn_name <- fs::path_file(here::here())

# Set knitr options, including root.dir pointing to the .here file in the notebook directory
knitr::opts_chunk$set(root.dir = here::here())

# ???
renv::use(lockfile = here::here("renv.lock"), attach = TRUE)  # automatically provision an R library when the notebook is run, and load packages
# renv::embed(path = here::here(rrRn_name), lockfile = here::here("renv.lock"))  # if run, this embeds renv.lock inside the notebook

renv::status()$synchronized
```
I'd like my notebooks to be able to run without code changes both locally (where dependencies are already installed and cached, and where the project was initialized) and when transferred to other systems. Each notebook should have its own renv settings.
I have many questions:
What's wrong with my renv sequence? Is calling renv::activate() on every run (both for initialization and after) the way to go? Should I use renv::use() instead of renv::install() and renv::hydrate()? Is renv::embed() better for a reproducible workflow even though every notebook folder should have its renv.lock and library? renv on activation also creates an .Rproj file (e.g. notebook1.Rproj) thus breaking my simple 1 repo = 1 Rproj - should this concern me?
The renv-RSPM workflow seems great, but is there any advantage of storing that script in the .Rprofile as opposed to having it within the Rmd itself?
Is there a better way to use here? That unloadNamespace("here") seems hacky, but it seems to be the only way to preserve a use for the .here files.
What's wrong with my renv sequence? Is calling renv::activate() on every run (both for initialization and after) the way to go? Should I use renv::use() instead of renv::install() and renv::hydrate()? Is renv::embed() better for a reproducible workflow even though every notebook folder should have its renv.lock and library?
If you already have a lockfile that you want to use + associate with your projects, then I would recommend just calling renv::restore(lockfile = "/path/to/lockfile"), rather than using renv::use() or renv::embed(). Those tools are specifically for the case where you don't want to use an external lockfile; that is, you'd rather embed your document's dependencies in the document itself.
The question about renv::restore() vs renv::install() comes down to whether you want the exact package versions as encoded in the lockfile, or whatever happens to be current / latest on the R package repositories visible to your session. I think the most typical workflow is something like:
Use renv::install(), renv::hydrate(), or other tools to install packages as you require them;
Confirm that your document is in a good, runnable state,
Call renv::snapshot() to "save" that state,
Use renv::restore() in future runs of your document to "load" that previously-saved state.
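Mapped onto renv calls, that cycle might look like the sketch below. The package names are examples, and this is a workflow outline rather than something to run top to bottom in a single session:

```r
# 1. While developing: install what the document needs
renv::install(c("dplyr", "here"))  # example package names
renv::hydrate(update = FALSE)      # pick up anything already used in the document

# 2. Confirm the document knits / runs cleanly, then...

# 3. ...record the exact versions in renv.lock
renv::snapshot(prompt = FALSE)

# 4. On a fresh checkout or another machine, reproduce that state
renv::restore(lockfile = "renv.lock")
```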
renv on activation also creates an .Rproj file (e.g. notebook1.Rproj) thus breaking my simple 1 repo = 1 Rproj - should this concern me?
If this is undesired behavior, you might want to file a bug report at https://github.com/rstudio/renv/issues, with a bit more context.
The renv-RSPM workflow seems great, but is there any advantage of storing that script in the .Rprofile as opposed to having it within the Rmd itself?
It just depends on how visible you want that configuration to be. Do you want it to be active for all R sessions launched in that project directory? If so, then it might belong in the .Rprofile. Do you only want it active for that particular R Markdown document? If so, it might be worth including there. (Bundling it in the R Markdown file also makes it easier to share, since you could then share just the R Markdown document without also needing to share the project / .Rprofile)
Is there a better way to use here? That unloadNamespace("here") seems hacky, but it seems to be the only way to preserve a use for the .here files.
If I understand correctly, you could just manually create a .here file yourself before loading the here package, e.g.
file.create("/path/to/.here")
library(here)
since that's all set_here() really does.

How do I use setwd in a relative way?

Our team uses R scripts in git repos that are shared between several people, across both Mac and Windows (and occasionally Linux) machines. This tends to lead to a bunch of really annoying lines at the top of scripts that look like this:
#path <- 'C:/data-work/project-a/data'
#path <- 'D:/my-stuff/project-a/data'
path = "~/projects/project-a/data"
#path = 'N:/work-projects/project-a/data'
#path <- "/work/project-a/data"
setwd(path)
To run the script, we have to comment/uncomment the correct path variable or the scripts won't run. This is annoying, untidy, and tends to be a bit of a mess in the commit history too.
In the past I've gotten around this by using shell scripts to set directories relative to the script's location and skipping setwd entirely (and then using ./run-scripts.sh instead of Rscript process.R), but as we've got Windows users here, that won't work. Is there a better way to simplify these messy setwd() boilerplates in R?
(side note: in Python, I solve this by using the path library to get the location of the script file itself, and then build relative paths from that. But R doesn't seem to have a way to get the location of the running script's file?)
The answer is to not use setwd() at all, ever. R does things a bit different than Python, for sure, but this is one thing they have in common.
Instead, any scripts you're executing should assume they're being run from a common, top-level, root folder. When you launch a new R process, its working directory (i.e., what getwd() gives) is set to the same folder as the process was spawned from.
As an example, if you had this layout:
.
├── data
│   └── mydata.csv
└── scripts
    └── analysis.R
You would run analysis.R from . and analysis.R would reference data/mydata.csv as "data/mydata.csv" (e.g., read.csv("data/mydata.csv", stringsAsFactors = FALSE)).
I would keep your shell scripts or Makefiles that run your R scripts and have the R scripts assume they're being run from the top level of the git repo.
This might look like:
cd . # Wherever `.` above is
Rscript scripts/analysis.R
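As a self-contained illustration, the sketch below builds that layout in a temporary directory. The single setwd() call only simulates launching R from the repo root for the sake of a runnable demo; your actual scripts would not contain it, and the file names are placeholders:

```r
# Simulate the repo layout in a temp dir
root <- file.path(tempdir(), "repo")
dir.create(file.path(root, "data"), recursive = TRUE, showWarnings = FALSE)
write.csv(data.frame(x = 1:3), file.path(root, "data", "mydata.csv"), row.names = FALSE)

owd <- setwd(root)  # stands in for "launch R from the repo root"; real scripts never call setwd()

# This is the only path logic analysis.R needs: a plain relative path
mydata <- read.csv("data/mydata.csv", stringsAsFactors = FALSE)

setwd(owd)
```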
Further reading:
https://www.tidyverse.org/articles/2017/12/workflow-vs-script/
https://github.com/jennybc/here_here
1) If you are looking for a way to find the path of the currently running script then see:
Rscript: Determine path of the executing script
2) Another approach is to require that users put an option of a prearranged name in their .Rprofile file. Then the script can setwd to that. An attractive aspect of this system is that over time one can forget where various projects are located and with this system one can just look at the .Rprofile file to remind oneself. For example, for projectA each person running the project would put this in their .Rprofile
options(projectA = "...whatever...")
and then the script would start off with:
proj <- getOption("projectA")
if (!is.null(proj)) setwd(proj) else stop("Set option 'projectA' to its directory")
One variation of this is to assume the current directory if projectA is not defined. Although this may seem to be more flexible I personally find the documenting feature of the above code to be a big advantage.
proj <- getOption("projectA")
if (!is.null(proj)) setwd(proj) else cat("Using", getwd(), "\n")
in Python, I solve this by using the path library to get the location of the script file itself, and then build relative paths from that. But R doesn't seem to have a way to get the location of the running script's file?
R itself unfortunately doesn’t have a way to do this. But you can achieve the same result in either of two ways:
Use packages instead of scripts where you include code via source. Then you can use the solution outlined in amoeba’s answer. This works because the real issue is that R has no way of telling the source function where to look for scripts.
Use box::use instead of source. The ‘box’ package provides a module system that allows relative imports of code modules. A nice side-effect of this is that the package provides a function that tells you the path of the current script, just like in Python (and, just like in Python, you normally don’t need to use this function directly).

How to organize R scripts that use functions stored in R/ directory of R package

I have the following package structure:
mypackage/
|-- .Rbuildignore
|-- .gitignore
|-- DESCRIPTION
|-- NAMESPACE
|-- inst
|-- extdata
|-- mydata.csv
|-- vignettes
|-- R
|-- utils.R
`-- mypackage.Rproj
Currently I store all the functions in the R/ directory. My questions are:
Where should I put scripts (e.g. one named try_functions.R) that try out the functions stored in R/? These scripts also use data stored in inst/extdata/.
And in the development process using RStudio, what's the workflow to update and try the package after we add and fix functions in R/?
It sounds to me like testthat is the package you are looking for. By "try", I presume you mean "test," and the way that it is canonically done for the testthat package is within a tests/testthat directory for the package.
Hadley's "Advanced R" book has a good deal more information about best practices, and you can find many good examples by looking at github.
Some excerpts from the docs:
Testing is a vital part of package development. It ensures that your
code does what you want it to do. Testing, however, adds an additional
step to your development workflow. The goal of this chapter is to show
you how to make this task easier and more effective by doing formal
automated testing using the testthat package.
And implementing:
To set up your package to use testthat, run:
devtools::use_testthat()
This will:
Create a tests/testthat directory.
Adds testthat to the Suggests field in the DESCRIPTION.
Creates a file tests/testthat.R that runs all your tests when R CMD
check runs. (You’ll learn more about that in automated checking.)
You also might look at the rprojroot package for referencing various places within the directory of the package.
The canonical place for keeping arbitrary R scripts is the inst/ subdirectory.
Note that tests of your package functionality are better put in the tests/ subdirectory. Loose scripts that are not tests (at least not tests of your package) should be placed in inst/. Those can be test scripts for checking the deployment environment, tests for checking production data quality, exec scripts to be plugged into crontab - whatever is useful or necessary in putting your package into action.
Quoting Writing R Extensions manual "Package subdirectories":
The contents of the inst subdirectory will be copied recursively to the installation directory. Subdirectories of inst should not interfere with those used by R (currently, R, data, demo, exec, libs, man, help, html and Meta, and earlier versions used latex, R-ex). The copying of the inst happens after src is built so its Makefile can create files to be installed. To exclude files from being installed, one can specify a list of exclude patterns in file .Rinstignore in the top-level source directory. These patterns should be Perl-like regular expressions (see the help for regexp in R for the precise details), one per line, to be matched case-insensitively against the file and directory paths, e.g. doc/.*[.]png$ will exclude all PNG files in inst/doc based on the extension.
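After installation, anything under inst/ sits at the top level of the installed package and is looked up with system.file(). A small sketch: "mypackage" and the extdata path are placeholders, and the stats lookup just demonstrates the same mechanism against a package guaranteed to be present in any R install:

```r
# For your own installed package, an inst/extdata/ file would be found like this
# ("mypackage" is a placeholder; system.file() returns "" if nothing matches):
csv <- system.file("extdata", "mydata.csv", package = "mypackage")

# The same mechanism against a base package that every R install has:
desc <- system.file("DESCRIPTION", package = "stats")
file.exists(desc)
```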

testthat .Rbuildignore + external file (NOTE)

I'm building a package that uses testthat for tests; those tests require an external file which, as recommended, lies in /tests/testthat/my-file.
However the R CMD check produces
Found the following hidden files and directories:
tests/testthat/my-file
The above is NOTE (Status: 1 NOTE)
If I add my-file to .Rbuildignore (devtools::use_build_ignore("/tests/testthat/my-file")), then the file is, well, ignored during the check, thus all tests fail and the package cannot be built.
How can I solve this issue? I understand that a NOTE is passable but I would like to get rid of it nonetheless.
The preferred way (according to Hadley) to load API credentials is via environment variables. If you are sharing the credentials with your package, you can just set them in an .onLoad function that will be run when the package namespace is loaded. If you just want to be able to run tests locally using those credentials but not share them, then add them to a global Renviron.site file (or, less conveniently, to an .Renviron file in your working directory). Then you can delete this file from your package structure (or just .Rbuildignore it) and make the tests conditional on the presence of the environment variable, with something like:
if (!identical(Sys.getenv("MY_ENV_VAR"), "")) {
  test_check("package")
}

R package development: possible to create subfolders within the R/ directory?

I'm trying to create an R package. I've used roxygen and devtools to help create all the necessary files, and it's working.
Among others, I have the folders /man, /R and /tests. Now I would like to create some subfolders in the /R directory, but once I do this and move any scripts inside, I get Error in namespaceExport(ns, exports) when trying to rebuild the package.
Can I only have script files directly within the /R directory, and is there any solution other than putting the script files in other folders one level up (such as old scripts that one may use in the future)?
Thanks
