Difference between environment etc. in testthat::test_check vs testthat::test_dir - r

This is somewhat deep R testing question, and as such, I'm not sure if general Stack Overflow is the right place for it, or if there's an R specific forum that would be better.
Any pointers on that are welcome.
The scenario is: I have package that is using testthat and has some tests in tests/testthat and (for reasons that are important but, to be honest, I don't totally understand) there are some other tests in inst/validation that need to be run as well, as part of a validation script (i.e. the script that this post is about).
I was running test_check(pkg) in my tests folder and it was working fine, but I wasn't getting the extra tests (which makes sense). So then I switched to the following:
test_dirs <- c("tests/testthat", "inst/validation")
for (.t in test_dirs) {
test_dir(.t)
}
Now a bunch of my tests are failing because they can't find some of the constants, etc. that are part of my package! (see note at the bottom for more details...)
So I dig in to the source code and find that test_check() actually calls testthat:::test_package_dir under the hood. Note the ::: this is an unexported function, so I don't really just want to call it in my own code.
testthat:::test_package_dir in turn calls the following, before calling test_dir() itself:
env <- test_pkg_env(package)
withr::local_options(list(topLevelEnvironment = env))
withr::local_envvar(list(
TESTTHAT_PKG = package,
TESTTHAT_DIR = maybe_root_dir(test_path)
))
test_dir(...
Sooooo... it seems like test_check() essentially just does some things to load the package environment (note test_pkg_env is also unexported) and then calls test_dir().
So I guess my question is: why? I've actually noticed this before with test_file() not working because it doesn't have everything in the package environment. Why do these functions not load the package environment like the other testing functions do?
Or really, my question is: is there a way to make them load it? And specifically in my case, is there a way to do what I'm trying to do (run tests in a few different directories) and have it load the package environment?
I notice this in the test_dirs docs:
env -- Environment in which to execute the tests. Expert use only.
which is set to test_env() by default. I have a feeling this is my answer, but I can't figure out how to get the package environment without basically copy/pasting a bunch of code out of functions that are hidden in :::. Perhaps I don't qualify as an "expert"...
Thanks for any insight and/or solutions!
note at the bottom:
Specifically my issue is that I have some "constants" in my aaa.R that are mostly just hard-coded strings or lists like:
SUMMARY_NAME <- "summary"
SUMMARY_COUNT <- "sum_count"
SUMMARY_PATH <- "sum_path"
SUM_REQ_COLS <- list(
list(name = SUMMARY_NAME, type = "character"),
list(name = SUMMARY_COUNT, type = "numeric"),
list(name = SUMMARY_PATH, type = "character"),
)
These are things that I use for checking S3 classes and other purposes so that I don't have hard-coded strings all over my code. The point is: I use some of these in my tests, which works fine for test_check() and devtools::check() and devtools::test() but dies when I try to use test_dir() or test_file() because they can't be found, presumably because the package environment isn't loaded.

Related

using rstudioapi in devtools tests

I'm making a package which contains a function that calls rstudioapi::jobRunScript(), and I would like to to be able to write tests for this function that can be run normally by devtools::test(). The package is only intended for use during interactive RStudio sessions.
Here's a minimal reprex:
After calling usethis::create_package() to initialize my package, and then usethis::use_r("rstudio") to create R/rstudio.R, I put:
foo_rstudio <- function(...) {
script.file <- tempfile()
write("print('hello')", file = script.file)
rstudioapi::jobRunScript(
path = script.file,
name = "foo",
importEnv = FALSE,
exportEnv = "R_GlobalEnv"
)
}
I then call use_test() to make an accompanying test file, in which I put:
test_that("foo works", {
foo_rstudio()
})
I then run devtools::test() and get:
I think I understand the basic problem here: devtools runs a separate R session for the tests, and that session doesn't have access to RStudio. I see here that rstudioapi can work inside child R sessions, but seemingly only those "normally launched by RStudio."
I'd really like to use devtools to test my function as I develop it. I suppose I could modify my function to accept an argument passed from the test code which will simply run the job in the R session itself or in some other kind of child R process, instead of an RStudio job, but then I'm not actually testing the normal intended functionality, and if there's an issue which is specific to the rstudioapi::jobRunScript() call and which could occur during normal use, then my tests wouldn't be able to pick it up.
Is there a way to initialize an RStudio process from within a devtools::test() session, or some other solution here?

R generic dispatching to attached environment

I have a bunch of functions and I'm trying to keep my workspace clean by defining them in an environment and attaching the environment. Some of the functions are S3 generics, and they don't seem to play well with this approach.
A minimum example of what I'm experiencing requires 4 files:
testfun.R
ttt.xxx <- function(object) print("x")
ttt <- function(object) UseMethod("ttt")
ttt2 <- function() {
yyy <- structure(1, class="xxx")
ttt(yyy)
}
In testfun.R I define an S3 generic ttt and a method ttt.xxx, I also define a function ttt2 calling the generic.
testenv.R
test_env <- new.env(parent=globalenv())
source("testfun.R", local=test_env)
attach(test_env)
In testenv.R I source testfun.R to an environment, which I attach.
test1.R
source("testfun.R")
ttt2()
xxx <- structure(1, class="xxx")
ttt(xxx)
test1.R sources testfun.R to the global environment. Both ttt2 and a direct function call work.
test2.R
source("testenv.R")
ttt2()
xxx <- structure(1, class="xxx")
ttt(xxx)
test2.R uses the "attach" approach. ttt2 still works (and prints "x" to the console), but the direct function call fails:
Error in UseMethod("ttt") :
no applicable method for 'ttt' applied to an object of class "xxx"
however, calling ttt and ttt.xxx without arguments show that they are known, ls(pos=2) shows they are on the search path, and sloop::s3_dispatch(ttt(xxx)) tells me it should work.
This questions is related to Confusion about UseMethod search mechanism and the link therein https://blog.thatbuthow.com/how-r-searches-and-finds-stuff/, but I cannot get my head around what is going on: why is it not working and how can I get this to work.
I've tried both R Studio and R in the shell.
UPDATE:
Based on the answers below I changed my testenv.R to:
test_env <- new.env(parent=globalenv())
source("testfun.R", local=test_env)
attach(test_env)
if (is.null(.__S3MethodsTable__.))
.__S3MethodsTable__. <- new.env(parent = baseenv())
for (func in grep(".", ls(envir = test_env), fixed = TRUE, value = TRUE))
.__S3MethodsTable__.[[func]] <- test_env[[func]]
rm(test_env, func)
... and this works (I am only using "." as an S3 dispatching separator).
It’s a little-known fact that you must use .S3method() to define methods for S3 generics inside custom environments (outside of packages).1 The reason almost nobody knows this is because it is not necessary in the global environment; but it is necessary everywhere else since R version 3.6.
There’s virtually no documentation of this change, just a technical blog post by Kurt Hornik about some of the background. Note that the blog post says the change was made in R 3.5.0; however, the actual effect you are observing — that S3 methods are no longer searched in attached environments — only started happening with R 3.6.0; before that, it was somehow not active yet.
… except just using .S3method will not fix your code, since your calling environment is the global environment. I do not understand the precise reason why this doesn’t work, and I suspect it’s due to a subtle bug in R’s S3 method lookup. In fact, using getS3method('ttt', 'xxx') does work, even though that should have the same behaviour as actual S3 method lookup.
I have found that the only way to make this work is to add the following to testenv.R:
if (is.null(.__S3MethodsTable__.)) {
.__S3MethodsTable__. <- new.env(parent = baseenv())
}
.__S3MethodsTable__.$ttt.xxx <- ttt.xxx
… in other words: supply .GlobalEnv manually with an S3 methods lookup table. Unfortunately this relies on an undocumented S3 implementation detail that might theoretically change in the future.
Alternatively, it “just works” if you use ‘box’ modules instead of source. That is, you can replace the entirety of your testenv.R by the following:
box::use(./testfun[...])
This code treats testfun.R as a local module and loads it, attaching all exported names (via the attach declaration [...]).
1 (and inside packages you need to use the equivalent S3method namespace declaration, though if you’re using ‘roxygen2’ then that’s taken care of for you)
First of all, my advice would be: don't try to reinvent R packages. They solve all the problems you say you are trying to solve, and others as well.
Secondly, I'll try to explain what went wrong in test2.R. It calls ttt on an xxx object, and ttt.xxx is on the search list, but is not found.
The problem is how the search for ttt.xxx happens. The search doesn't look for ttt.xxx in the search list, it looks for it in the environment from which ttt was called, then in an object called .__S3MethodsTable__.. I think there are two reasons for this:
First, it's a lot faster. It only needs to look in one or two places, and the table can be updated whenever a package is attached or detached, a relatively rare operation.
Second, it's more reliable. Each package has its own methods table, because two packages can use the same name for generics that have nothing to do with each other, or can use the same class names that are unrelated. So package code needs to be able to count on finding its own definitions first.
Since your call to ttt() happens at the top level, that's where R looks first for ttt.xxx(), but it's not there. Then it looks in the global .__S3MethodsTable__. (which is actually in the base environment), and it's not there either. So it fails.
There is a workaround that will make your code work. If you run
.__S3MethodsTable__. <- list2env(list(ttt.xxx = ttt.xxx))
as the last line of testenv.R, then you'll create a methods table in the global environment. (Normally there isn't one there, because that's user space, and R doesn't like putting things there unless the user asks for it.)
R will find that methods table, and will find the ttt.xxx method that it defines. I wouldn't be surprised if this breaks some other aspect of S3 dispatch, so I don't recommend doing it, but give it a try if you insist on reinventing the package system.

Accessing the entire source code of a function that has "useMethod("packagefunction") in Rstudio?

I am trying to extend the functionality of a package and therefore trying to access the entire code behind one of the functions.
The package in question is RQuantLib and I am trying to see the entire code used in the function "DiscountCurve"
The result I get is simply:
function (params, tsQuotes, times = seq(0, 10, 0.1), legparams = list(dayCounter = "Thirty360",
fixFreq = "Annual", floatFreq = "Semiannual"))
{
UseMethod("DiscountCurve")
}
I have tried quite a few solutions as posted in this thread with no luck: How can I view the source code for a function?
UseMethod("DiscountCurve") does not tell me much. As far as I understand this is a translated package from C++.
I am rather new to coding so it is possible that I have just not implemented the correct solution in the thread above correctly.
Edit for more detial on methods used so far:
Results from methods:
> methods("DiscountCurve")
1 DiscountCurve.default*
When checking methods(Class="default") I get 184 results. Due to space I will post screenshots of the code: prnt.sc/tws98x
Further using getAnywhere: prnt.sc/tws9rq
If you do:
RQuantLib:::DiscountCurve.default
You will see the actual code that runs when the generic calls UseMethod("DiscountCurve") . However, you are likely to be disappointed, because essentially that function is a glorified type-checker which passes your parameters safely to another unexported function called discountCurveEngine, which looks like this:
RQuantLib:::discountCurveEngine
function (rparams, tslist, times, legParams)
{
.Call(`_RQuantLib_discountCurveEngine`, rparams, tslist,
times, legParams)
}
Which, you will see, is actually a thin wrapper for the C++ code that actually does the calculation. It is written in Rcpp-flavoured C++ and you can read the source code here. However, this in turn calls functions from another C++ library called Quantlib.
Depending on how keen you are, and how proficient you are in C++, you may find this enjoyably challenging or dishearteningly baffling, but at least you know where to find the source code.

Removing/de-registering a specific function from an R package

I may not be using the terminology correctly here so please forgive me...
I have a case of one package 'overwriting' a function with the same name loaded by another package and thus changing the behavior (breaking) of a function.
The specific case:
X <- data.frame ( y = rnorm(100), x1 = rnorm(100), x2 = rnorm(100) )
library(CausalImpact)
a <- CausalImpact::CausalImpact( X, c(1,75), c(76, 100) ) # works
library(bfast) # imports quantmod which loads crappy version of as.zoo.data.frame
b <- CausalImpact::CausalImpact( X, c(1,75), c(76, 100) ) # Error
I know the error comes from two versions of the function as.zoo.data.frame.
The problematic version is imported by bfast from the package 'quantmod' (see https://github.com/joshuaulrich/quantmod/issues/168). Unfortunately their hotfix did not prevent this error. Super annoying.
I can hack around this specific problem, but I was wondering if there is a general way to like 'de-register' this function variant from the search path. Neither detach nor unloadNamespace remove the offending function (same behavior after). An explanation and similar problem is discussed here and here, but I wasn't able to find a general solution. For instance I'd rather just remove this function than clone and re-write CausalImpact to deal with this behavior.
From R 3.6.0 onwards, there is a new option called "conflicts.policy" to handle this within an established framework. For small issues like this, you can use the new arguments to library(). If you aren't yet to 3.6, the easiest solution might be to explicitly namespace CausalImpact when you need it, i.e. CausalImpact::CausalImpact. That's a mouthful, so you could do causal_impact <- CausalImpact::CausalImpact and use that alias.
# only attach select
library(dplyr, include.only = "select")
# exclude slice/arrange from being attached.
library(dplyr, exclude = c("slice", "arrange"))
library(bfast, exclude = "CausalImpact") should solve your problem.
Attach means that they are available for use without explicit prefixing with their package. In either of these cases, something like dplyr::slice would work just fine.
For more information, you can see ?library. Also, the R-Core member Luke Tierney wrote a blog explaining how the conflicts.policy works. You can find that here
Here's an answer that works, but is less preferable than de-registering a S3 method because it involves replacing the registered version in the S3 Methods table with the desired method:
library(CausalImpact)
library(bfast)
assignInNamespace("as.zoo.data.frame", zoo:::as.zoo.data.frame, ns = asNamespace("zoo"))
based partially on #smingerson's suggestion in the comments

R best practices: do I need to "unset" a RefClass?

BLUP: What are the risks of creating an environment whose parent is .GlobalEnv, and calling setRefClass within this environment?
I have a package (repository on Github) that loads the contents of an R file provided by HDFql. This wrapper contains a call to setRefClass. After trying a lot of different things (most of which failed) I settled on sourcing the wrapper into an environment that is a child of .GlobalEnv. The environment itself lives in an environment contained in the package. This nesting was required to get around binding errors, because the call to setRefClass fails if it is executed inside an environment whose ancestor is the package namespace. The global environment seemed to be the only environment suitable for the setRefClass evaluation.
However, I'm a bit worried about creating and using an environment in a package whose parent is .GlobalEnv, and making calls to setRefClass inside this environment. What are potential pitfalls when doing this? Are there best practices for removing or "unsetting" the RefClass when finished? Is there a better solution I am not thinking of?
I have included some sample code below, although it is not reproducible; if you want a reproducible example, you can clone the package repository and/or install it with devtools. The code in question lives in the function hql_load() in file connect.r.
# "hql" is an empty environment exported by the package
constants = new.env(parent = .GlobalEnv)
source(constants.file, local = constants)
assign("constants", constants, envir = hql)
where constants.file contains the code
hdfql_cursor_ <- setRefClass("hdfql_cursor_",
field = list(address = "numeric"),
method = list(
finalize = function(){
.Call("_hdfql_cursor_destroy", .self$address, PACKAGE = "HDFqlR")
}
)
)

Resources