Where to place library calls with testthat? - r

I am looking for best practice help with the brilliant testthat. Where is the best place to put your library(xyzpackage) calls to use all the package functionality?
I first have been setting up a runtest.R setting up paths and packages.
I then run test_files(test-code.R) which only holds context and tests. Example for structure follows:
# runtests.R
library(testthat)
library(data.table)
source("code/plotdata-fun.R")
mytestreport <- test_file("code/test-plotdata.R")
# does other stuff like append date and log test report (not shown)
and in my test files e.g. test-plotdata.R (a stripped down version):
#my tests for plotdata
context("Plot Chart")
test_that("Inputs valid", {
test.dt = data.table(px = c(1,2,3), py = c(1,4,9))
test.notdt <- c(1,2,3)
# illustrating the check for data table as first param in my code
expect_error(PlotMyStandardChart(test.notdt,
plot.me = FALSE),
"Argument must be data table/frame")
# then do other tests with test.dt etc...
})
Is this the way #hadley intended it to be used? It is not clear from the journal article. Should I also be duplicating library calls in my test files as well? Do you need a library set up in each context, or just one at start of file?
Is it okay to over-call library(package) in r?
To use the test_dir() and other functionality what is best way to set up your files. I do use require() in my functions, but I also set up test data examples in the contexts. (In above example, you will see I would need data.table package for test.dt to use in other tests).
Thanks for your help.

Some suggestions / comments:
set up each file so that it can be run on its own with test_file without additional setup. This way you can easily run an individual file while developing if you are just focusing on one small part of a larger project (useful if running all your tests is slow)
there is little penalty to calling library multiple times as that function first checks to see if the package is already attached
if you set up each file so that it can be run with test_file, then test_dir will work fine without having to do anything additional
you don't need to library(testthat) in any of your test files since presumably you are running them with test_file or test_dir, which would require testthat to be loaded
Also, you can just look at one of Hadley's recent packages to see how he does it (e.g. dplyr tests).

If you use devtools::test(filter="mytestfile") then devtools will take care of calling helper files etc. for you. In this way you don't need to do anything special to make your test file work on its own. Basically, this is just the same as if you ran a test, but only had test_mytestfile.R in your testthat folder.

Related

Determining if there are unused packages in an R script [duplicate]

As my code evolves from version to version, I'm aware that there are some packages for which I've found better/more appropriate packages for the task at hand or whose purpose was limited to a section of code which I've now phased out.
Is there any easy way to tell which of the loaded packages are actually used in a given script? My header is beginning to get cluttered.
Update 2020-04-13
I've now updated the referenced function to use the abstract syntax tree (AST) instead of using regular expressions as before. This is a much more robust way of approaching the problem (it's still not completely ironclad). This is available from version 0.2.0 of funchir, now on CRAN.
I've just got around to writing a quick-and-dirty function to handle this which I call stale_package_check, and I've added it to my package (funchir).
e.g., if we save the following script as test.R:
library(data.table)
library(iotools)
DT = data.table(a = 1:3)
Then (from the directory with that script) run funchir::stale_package_check('test.R'), we'll get:
Functions matched from package data.table: data.table
**No exported functions matched from iotools**
Have you considered using packrat?
packrat::clean() would remove unused packages, for example.
I've written a command-line script to accomplish this task. You can find it in this Github gist. I'm sure there are edge cases that it misses, but it works pretty well, on both R scripts and Rmd files.
My approach always is to close my R script or IDE (i.e. RStudio) and then start it again.
After this I run my function without loading any dependecies/packages beforehand.
This should result in various warning and error messages telling you which functions couldn't be found and executed. This again will give you hints on what packages are necessary to load beforehand and which one you can leave out.

how to use utils::globalVariables

Following your recommendations (or trying to do it, at least), I have tried some options, but the problem remains, so there must be something I am missing.
I have included a more complete code
setwd("C:/naapp")
#' #import utils
#' #import devtools
I have tried with and without using suppressForeignCheck
if(getRversion() >= "2.15.1"){
utils::globalVariables(c("eleven"))
utils::suppressForeignCheck(c("eleven"))
}
myFunctionSum <- function(X){print(X+eleven)}
myFunctionMul <- function(X){print(X*eleven)}
myFunction11 <- function(X){
assign("eleven",11,envir=environment(myFunctionMul))
}
maybe I should use a particular environment?
package.skeleton(name = "myPack11", list=ls(),
path = "C:/naapp", force = TRUE,
code_files = character())
I remove the "man" directory from the directory myPack11,
otherwise I would get an error because the help files are empty.
I add the imports utils, and devtools to the descrption
Then I run check
devtools::check("myPack11")
And I still get this note
#checking R code for possible problems ... NOTE
#myFunctionMul: no visible binding for global variable 'eleven'
#myFunctionSum: no visible binding for global variable 'eleven'
#Undefined global functions or variables:eleven
I have tried also to make an enviroment, combining Tomas Kalibera's suggetion and an example I found in the Internet.
myEnvir <- new.env()
myEnvir$eleven <- 11
etc
In this case, I get the same note, but with "myEnvir", instead of "eleven"
First version of the question
I trying to use "globalVariables" from the package utils. I am building an interface in R and I am planning to submit to CRAN. This is my first time, so, sorry if the question is very basic.I have read the help and I have tried to find examples, but I still don't know how to use it.
I have made a little silly example to ilustrate my question, which is:
Where do I have to place this line exactly?:
if(getRversion() >= "2.15.1"){utils::globalVariables("eleven")}
My example has three functions. myFunction11 creates the global variable "eleven" and the other two functions manipulate it. In my real code, I cannot use arguments in the functions that are called by means of a button. Consider that this is just a silly example to learn how to use globalVariables (to avoid binding notes).
myFunction11 <- function(){
assign("eleven",11,envir=environment(myFunctionSum))
}
myFunctionSum <- function(X){
print(X+eleven)
}
myFunctionMul <- function(X){
print(X*eleven)
}
Thank you in advance
I thought that the file globals.R would be automatically generated when using globalsVariables. The problem was that I needed to create the package skeleton, then create the file globals.R, add it to the R directory in the package and check the package.
So, I needed to place this in a different file:
#' #import utils
utils::globalVariables(c("eleven"))
and save it
The documentation clearly says:
## In the same source file (to remind you that you did it) add:
if(getRversion() >= "2.15.1") utils::globalVariables(c(".obj1", "obj2"))
so put it in the same source file as your functions. It can go in any of your R source files, but the comment above recommends you put it close to your code. Looking at a bunch of github packages reveals another common pattern is to have a globals.R function with it in, but this is probably a bad idea. If you later remove the global from your package but neglect to update globals.R you could mask a problem. Putting it right close to the functions that use it will hopefully remind you when you edit those functions.
Make sure you put it outside any function definitions in the file, or it won't get seen.
You cannot modify bindings in a package namespace once the package is loaded (and namespace sealed, and bindings locked). The check tool helps you to spot violations of this restriction, so you find out about the problem when checking the package rather than while running it. globalVariables is just a call to silence check when looking for these violations, which is undesirable in almost all cases. If you really need mutable state in a package, you can create a new environment (using new.env) and bind it to an (unexported) "global" variable in your namespace. This binding will be locked, but this is ok, because in R you can change an environment in place (add/remove elements, effectively modifying the elements).
The best situation is however when you can keep all mutable state in user objects (passed in as arguments into functions, and their modified versions returned as output values of functions).

Make tests run in specific order [duplicate]

I am using testthat to check the code in my package. Some of my tests are for basic functionality, such as constructors and getters. Others are for complex functionality that builds on top of the basic functionality. If the basic tests fail, then it is expected that the complex tests will fail, so there is no point testing further.
Is it possible to:
Ensure that the basic tests are always done first
Make a test-failure halt the testing process
To answer your question, I don't think it can be determined other than by having appropriate alphanumeric naming of your test-*.R files.
From testthat source, this is the function which test_package calls, via test_dir, to get the tests:
find_test_scripts <- function(path, filter = NULL, invert = FALSE, ...) {
files <- dir(path, "^test.*\\.[rR]$", full.names = TRUE)
What is wrong with just letting the complex task fail first, anyway?
A recent(ish) development to testthat is parallel test processing
This might not be suitable in your case as it sounds like you might have complex interdependencies between tests. If you could isolate your tests (I think that would be the more usual case anyway) then parallel processing is a great solution as it should speed up overall processing time as well as probably showing you the 'quick' tests failing before the 'slow' tests have completed.
Regarding tests order, you can use testthat config in the DESCRIPTION file to determine test order (by file) - As the documentation suggests
By default testthat starts the test files in alphabetical order. If you have a few number of test files that take longer than the rest, then this might not be the best order. Ideally the slow files would start first, as the whole test suite will take at least as much time as its slowest test file. You can change the order with the Config/testthat/start-first option in DESCRIPTION. For example testthat currently has:
Config/testthat/start-first: watcher, parallel*
Docs

Is it possible to determine test order in testthat?

I am using testthat to check the code in my package. Some of my tests are for basic functionality, such as constructors and getters. Others are for complex functionality that builds on top of the basic functionality. If the basic tests fail, then it is expected that the complex tests will fail, so there is no point testing further.
Is it possible to:
Ensure that the basic tests are always done first
Make a test-failure halt the testing process
To answer your question, I don't think it can be determined other than by having appropriate alphanumeric naming of your test-*.R files.
From testthat source, this is the function which test_package calls, via test_dir, to get the tests:
find_test_scripts <- function(path, filter = NULL, invert = FALSE, ...) {
files <- dir(path, "^test.*\\.[rR]$", full.names = TRUE)
What is wrong with just letting the complex task fail first, anyway?
A recent(ish) development to testthat is parallel test processing
This might not be suitable in your case as it sounds like you might have complex interdependencies between tests. If you could isolate your tests (I think that would be the more usual case anyway) then parallel processing is a great solution as it should speed up overall processing time as well as probably showing you the 'quick' tests failing before the 'slow' tests have completed.
Regarding tests order, you can use testthat config in the DESCRIPTION file to determine test order (by file) - As the documentation suggests
By default testthat starts the test files in alphabetical order. If you have a few number of test files that take longer than the rest, then this might not be the best order. Ideally the slow files would start first, as the whole test suite will take at least as much time as its slowest test file. You can change the order with the Config/testthat/start-first option in DESCRIPTION. For example testthat currently has:
Config/testthat/start-first: watcher, parallel*
Docs

How can I tell which packages I am not using in my R script?

As my code evolves from version to version, I'm aware that there are some packages for which I've found better/more appropriate packages for the task at hand or whose purpose was limited to a section of code which I've now phased out.
Is there any easy way to tell which of the loaded packages are actually used in a given script? My header is beginning to get cluttered.
Update 2020-04-13
I've now updated the referenced function to use the abstract syntax tree (AST) instead of using regular expressions as before. This is a much more robust way of approaching the problem (it's still not completely ironclad). This is available from version 0.2.0 of funchir, now on CRAN.
I've just got around to writing a quick-and-dirty function to handle this which I call stale_package_check, and I've added it to my package (funchir).
e.g., if we save the following script as test.R:
library(data.table)
library(iotools)
DT = data.table(a = 1:3)
Then (from the directory with that script) run funchir::stale_package_check('test.R'), we'll get:
Functions matched from package data.table: data.table
**No exported functions matched from iotools**
Have you considered using packrat?
packrat::clean() would remove unused packages, for example.
I've written a command-line script to accomplish this task. You can find it in this Github gist. I'm sure there are edge cases that it misses, but it works pretty well, on both R scripts and Rmd files.
My approach always is to close my R script or IDE (i.e. RStudio) and then start it again.
After this I run my function without loading any dependecies/packages beforehand.
This should result in various warning and error messages telling you which functions couldn't be found and executed. This again will give you hints on what packages are necessary to load beforehand and which one you can leave out.

Resources