R code can be run in various ways: called via source(), loaded from a package, read in from stdin, and so on. I'd like to detect which of these applies, in order to create files that work in a multitude of contexts.
Current experimental detector script is here: https://gitlab.com/-/snippets/2268211
Some of the tests are a bit heuristic, based on observation rather than documentation. For example, I'm not sure which of the following two tests for running under littler is better:
if (Sys.getenv("R_PACKAGE_NAME") == "littler") {
  message("++ R_PACKAGE_NAME suggests running under littler")
  mode_found <- TRUE
}
if (Sys.getenv("R_INSTALL_PKG") == "littler") {
  message("++ R_INSTALL_PKG suggests running under littler")
  mode_found <- TRUE
}
and the test for being loaded from a package simply checks whether the current environment is a namespace:
if (isNamespace(environment())) {
  message("++ Being loaded by a package")
  mode_found <- TRUE
}
This seems to be true during package load, but I suppose it could also be true in other contexts, for example when a file is read with source() using a local argument that is a namespace.
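For what it's worth, a minimal sketch of that false positive (the temporary file is hypothetical; the point is only that isNamespace(environment()) is TRUE whenever the evaluation environment is a namespace, not just during package loading):
f <- tempfile(fileext = ".R")
writeLines('if (isNamespace(environment())) message("++ Being loaded by a package?")', f)
source(f, local = asNamespace("stats"))  # message is emitted, yet no package is being loaded
unlink(f)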
In the end I suspect most of these cases won't matter much to my application, but it might be useful to someone to have a set of detection tests that is as complete as possible.
So, are the tests in my detector script okay, and how could they be improved?
Related
I am writing unit tests for a package using testthat.
The package maintains a cache of database connections, which are configured with YAML files. The cache is populated during loading by searching for config files in specific paths, including rappdirs::site_config_dir() and rappdirs::user_config_dir().
This behaviour is desired in normal usage, but during testing it means the tests will not be reproducible across machines or over time.
My current approach is to create a configure/don't configure switch based on a global option in .onLoad():
no_connect <- options("SQLHELPER_NO_CONNECT_ON_LOAD")
if (is.null(no_connect[[1]])) {
  create_connections()
} else if (no_connect[[1]] == FALSE) {
  create_connections()
}
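For reference, a minimal sketch of how such a switch might sit inside .onLoad(); the option name and create_connections() come from the question, while the simplification to getOption()/isTRUE() is my assumption about how the option is meant to be used:
.onLoad <- function(libname, pkgname) {
  # opt-out switch: connect unless the option is explicitly set to TRUE
  if (!isTRUE(getOption("SQLHELPER_NO_CONNECT_ON_LOAD", default = FALSE))) {
    create_connections()
  }
}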
which works interactively; my question is:
How can I make testthat set the option before it loads the package?
(I have tried fiddling with tests/testthat.R, but that is not recommended and it is only used by R CMD check anyway. I'm hoping for something that also works with devtools::test() etc.)
UPDATE: I thought briefly that I could set the option in tests/testthat/setup.R, but I think I must have got confused after playing with setting and unsetting it for an afternoon. For anyone reading this in search of an answer: setup.R is run by testthat after .onLoad() has been run by loading the package.
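In other words, by the time a setup file runs the switch has already been consulted, so something like the following (hypothetical) setup file has no effect on .onLoad():
## tests/testthat/setup.R -- runs after test_check()/devtools::test() has already
## loaded the package, so .onLoad() has run and this comes too late
options(SQLHELPER_NO_CONNECT_ON_LOAD = TRUE)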
I am using testthat to check the code in my package. Some of my tests are for basic functionality, such as constructors and getters. Others are for complex functionality that builds on top of the basic functionality. If the basic tests fail, then it is expected that the complex tests will fail, so there is no point testing further.
Is it possible to:
Ensure that the basic tests are always done first
Make a test failure halt the testing process
To answer your question: I don't think the order can be controlled other than by appropriate alphanumeric naming of your test-*.R files.
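For example, a numeric prefix scheme like the following (hypothetical file names) makes the alphabetical scan pick up the basic tests before the complex ones:
tests/testthat/
  test-01-constructors.R
  test-02-getters.R
  test-10-complex.R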
From the testthat source, this is the function that test_package() calls, via test_dir(), to find the tests:
find_test_scripts <- function(path, filter = NULL, invert = FALSE, ...) {
  files <- dir(path, "^test.*\\.[rR]$", full.names = TRUE)
  ## ... (the optional filter argument is then applied to these file names)
}
What is wrong with just letting the complex tests fail as well, anyway?
A recent(ish) addition to testthat is parallel test processing.
This might not be suitable in your case, as it sounds like you might have complex interdependencies between tests. If you can isolate your tests (which I think is the more usual case anyway), parallel processing is a great solution: it should speed up overall processing time, and it will probably show you the 'quick' tests failing before the 'slow' tests have completed.
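If you go that route, parallel execution is an opt-in set in the DESCRIPTION file; as far as I know it requires the testthat 3rd edition, hence both fields:
Config/testthat/edition: 3
Config/testthat/parallel: true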
Regarding test order, you can use testthat configuration in the DESCRIPTION file to determine the order (by file), as the documentation suggests:
By default testthat starts the test files in alphabetical order. If you have a few test files that take longer than the rest, this might not be the best order. Ideally the slow files would start first, as the whole test suite will take at least as much time as its slowest test file. You can change the order with the Config/testthat/start-first option in DESCRIPTION. For example testthat currently has:
Config/testthat/start-first: watcher, parallel*
Docs
Is there an easy way to skip the execution of some tests in a package if the package is tested by CRAN? The background is that I like to have a lot of tests, and in sum they are time-consuming (not good for CRAN).
I know there is testthat::skip_on_cran(), but I do not want to use the testthat package, to avoid another dependency. I am looking for an el-cheapo way to mimic testthat::skip_on_cran().
Ideally, I would like to have a test file in the directory pkg/tests that calls the tests (test files) and distinguishes whether or not we are on CRAN:
if (!on_cran) {
  ## these tests are run only locally/when not on CRAN
  # run test 1 (e.g. source a file with a test)
  # run test 2
}
# run test 3 - always
Yes! You can handle this programmatically and automatically. Let me detail a few approaches I've set up:
Implicitly via version numbers: This is the approach taken by Rcpp for many years now, and it is entirely generic and not dependent on any other package. Our tests start from a file in tests/ and then hand over to RUnit, but that last part is an implementation detail.
In the main file tests/doRUnit.R we do this:
## force tests to be executed if in dev release which we define as
## having a sub-release, e.g. 0.9.15.5 is one whereas 0.9.16 is not
if (length(strsplit(packageDescription("Rcpp")$Version, "\\.")[[1]]) > 3) {
  Sys.setenv("RunAllRcppTests" = "yes")
}
In essence, we test whether the version is of the form a.b.c.d, and if so conclude that it is a development version; this implies "run all tests". A release version of the form a.b.c would go to CRAN and would not run these tests, as they would exceed the CRAN time limit.
In each of the actual unit test files, we can then decide whether to honour the variable, running the tests only when it is set, or to execute them regardless:
.runThisTest <- Sys.getenv("RunAllRcppTests") == "yes"

if (.runThisTest) {
  ## code here that contains the tests
}
This mechanism is fully automatic, and does not depend on the user. (In the actual package version there is another if () test wrapped in there which allows us to suppress the tests, but that is a detail we do not need here).
I still like this approach a lot.
Explicitly via resource files: Another package a few of us work on (a lot lately) requires a particular backend to be available. So in the Rblpapi package we test for the presence of a file that my coauthors and I each have below our $HOME directory, which sets up credentials and connection details. If the file is missing (as it is on Travis CI, on CRAN, or for other users), the tests are skipped.
We chose to make the resource file an R file; the test script sources it if found, and the file sets values via options(). That way we can directly control whether or not to launch the tests.
## We need to source an extra parameter file to support a Bloomberg connection
## For live sessions, we use ~/.Rprofile but that file is not read by R CMD check
## The file basically just calls options() and sets options as needed for blpHost,
## blpPort, blpAutoConnect (to ensure blpConnect() is called on package load) and,
## as tested for below, blpUnitTests.
connectionParameterFile <- "~/.R/rblpapiOptions.R"
if (file.exists(connectionParameterFile)) source(connectionParameterFile)
## if an option is set, we run tests. otherwise we don't.
## recall that we DO need a working Bloomberg connection...
if (getOption("blpUnitTests", FALSE)) {
  ## ... more stuff here which sets things up
}
Similarly to the first use case we can now set more variables which are later tested.
Explicitly via Travis CI: Another option, which we use in rfoaas, is to set the governing environment variable in the Travis CI file:
env:
global:
- RunFOAASTests=yes
which the test script then picks up:
## Use the Travis / GitHub integrations as we set this
## environment variable to "yes" in .travis.yml
##
## Set this variable manually if you want to run the tests
##
if (Sys.getenv("RunFOAASTests") == "yes") runTests <- TRUE
In that case I also set the toggle based on my userid as I am pretty much the sole contributor to that project:
## Also run the tests when building on Dirk's box, even when
## the environment variable is not set
if (isTRUE(unname(Sys.info()["user"]) == "edd")) runTests <- TRUE
Explicitly via another variable: You can of course also rely on some other variable that you use across all your packages. I find that to be a bad idea: if you set it in your shell to suppress tests while working on package A and then switch to package B, you will likely forget to unset the variable and then fail to test. I like this approach the least and do not use it.
Use an environment variable, like testthat does:
skip_on_cran <- function() {
  if (identical(Sys.getenv("NOT_CRAN"), "true")) {
    return(invisible(TRUE))
  }
  skip("On CRAN")
}
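For the dependency-free layout sketched in the question, the same check can be inlined in a plain tests/ script; a minimal sketch (the file name is hypothetical, and note that NOT_CRAN is an informal convention set to "true" by devtools/testthat tooling when running locally, so you may need to export it yourself in other workflows):
## tests/run-extra-tests.R (hypothetical) -- no testthat dependency needed
not_cran <- identical(Sys.getenv("NOT_CRAN"), "true")

if (not_cran) {
  ## these tests are run only locally/when not on CRAN
  # run test 1 (e.g. source a file with a test)
  # run test 2
}
# run test 3 - always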
What is the proper way to skip all tests in the test directory of an R package when using testthat/devtools infrastructure? For example, if there is no connection to a database and all the tests rely on that connection, do I need to write a skip in all the files individually or can I write a single skip somewhere?
I have a standard package setup that looks like
mypackage/
  ...            # other package stuff
  tests/
    testthat.R
    testthat/
      test-thing1.R
      test-thing2.R
At first I thought I could put a test in the testthat.R file like
## in testthat.R
library(testthat)
library(mypackage)

fail_test <- function() FALSE
if (fail_test()) test_check("mypackage")
but that didn't work, and it looks like calling devtools::test() just ignores that file. I guess an alternative would be to store all the tests in another directory, but is there a better solution?
The Skipping a test section in the R Packages book covers this use case. Essentially, you write a custom function that checks whatever condition you need to check (here, whether or not you can connect to your database) and then call that function from every test that requires that condition to be satisfied.
An example, adapted from the book (skip when the connection is not available):
skip_if_no_db <- function() {
  if (!db_conn()) {
    skip("Database not available")
  }
}
test_that("foo api returns bar when given baz", {
skip_if_no_db()
...
})
I've found this approach more useful than a single switch that toggles off all tests, since I tend to have a mix of tests that do and don't rely on whatever condition I'm checking, and I want to run as many tests as possible.
Maybe you can organize the tests in subdirectories, putting the conditional inclusion of a directory in a parent-level test file.
Consider the 'tests' directory in the testthat package itself.
In particular, this one looks interesting:
test-test_dir.r, which includes the 'test_dir' subdirectory
I do not see anything that recurses into subdirectories in the test scan:
test_dir
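As a sketch of that idea, assuming a helper can_connect_to_db() of your own and an extra directory tests/testthat/db/ holding the optional tests (both hypothetical):
## tests/testthat/test-db.R -- conditionally run a whole subdirectory of tests
if (can_connect_to_db()) {                # your own availability check (assumed)
  test_dir(testthat::test_path("db"))     # i.e. tests/testthat/db/
}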
I am looking for best practice help with the brilliant testthat. Where is the best place to put your library(xyzpackage) calls to use all the package functionality?
So far I have been setting up a runtests.R file that sets up paths and packages.
I then run test_file("test-code.R"), where the test file only holds the context and tests. An example of the structure follows:
# runtests.R
library(testthat)
library(data.table)
source("code/plotdata-fun.R")
mytestreport <- test_file("code/test-plotdata.R")
# does other stuff like append date and log test report (not shown)
and in my test files, e.g. test-plotdata.R (a stripped-down version):
# my tests for plotdata
context("Plot Chart")

test_that("Inputs valid", {
  test.dt <- data.table(px = c(1, 2, 3), py = c(1, 4, 9))
  test.notdt <- c(1, 2, 3)
  # illustrating the check for data table as first param in my code
  expect_error(PlotMyStandardChart(test.notdt, plot.me = FALSE),
               "Argument must be data table/frame")
  # then do other tests with test.dt etc...
})
Is this the way #hadley intended it to be used? It is not clear from the journal article. Should I also be duplicating the library calls in my test files? Do you need the library set-up in each context, or just once at the start of a file?
Is it okay to over-call library(package) in R?
To use test_dir() and other functionality, what is the best way to set up your files? I do use require() in my functions, but I also set up test data examples in the contexts. (In the above example, you will see that I would need the data.table package for test.dt to be used in other tests.)
Thanks for your help.
Some suggestions / comments:
set up each file so that it can be run on its own with test_file(), without additional setup (see the sketch after this list). This way you can easily run an individual file while developing, if you are just focusing on one small part of a larger project (useful if running all your tests is slow)
there is little penalty to calling library() multiple times, as that function first checks whether the package is already attached
if you set up each file so that it can be run with test_file, then test_dir will work fine without having to do anything additional
you don't need to call library(testthat) in any of your test files, since presumably you are running them with test_file() or test_dir(), which already require testthat to be loaded
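Putting the first two points together, a self-contained version of the question's test file might look something like this; the paths are taken from the question's runtests.R and assume the same working directory, so adjust them to wherever you run from:
## code/test-plotdata.R -- self-contained: loads what it needs itself
library(data.table)            # cheap to repeat; a no-op if already attached
source("code/plotdata-fun.R")  # path as used in the question's runtests.R

context("Plot Chart")

test_that("Inputs valid", {
  test.dt <- data.table(px = c(1, 2, 3), py = c(1, 4, 9))
  expect_true(is.data.table(test.dt))
})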
Also, you can just look at one of Hadley's recent packages to see how he does it (e.g. dplyr tests).
If you use devtools::test(filter = "mytestfile"), then devtools will take care of loading the helper files etc. for you. In this way you don't need to do anything special to make your test file work on its own. Basically, this is just the same as running all the tests, but as if only test_mytestfile.R were in your testthat folder.
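For the file layout from the question, that might look like the following; as I understand it, the filter is a regular expression matched against the test file names after the test- prefix and .R suffix are stripped:
## run only tests/testthat/test-plotdata.R; helper-*.R and setup files are loaded first
devtools::test(filter = "plotdata")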