RUnit: could not find function "checkEquals" - r

I am creating an R package with the standard directory hierarchy. Inside the R directory, I create a test subdirectory.
In the R directory, I create a uTest.R file containing:
uTest <- function() {
test.suite <- defineTestSuite('test',
dirs = file.path('R/test'))
test.result <- runTestSuite(test.suite)
printTextProtocol(test.result)
}
In the R/test directory, I create a runit.test.R file containing:
test.validDim <- function() {
testFile <- "test/mat.csv"
generateDummyData(testFile,
10,
10)
checkEquals(validDim(testFile), TRUE)
}
I build and install my package using R CMD INSTALL --no-multiarch --with-keep.source RMixtComp in Rstudio. When I try to launch the function uTest(), I get this error message:
1 Test Suite :
test - 1 test function, 1 error, 0 failures
ERROR in test.validDim: Error in func() : could not find function "checkEquals"
However, if I call library(RUnit) prior to calling uTest(), everything works fine. In the Imports field of the DESCRIPTION file, I added RUnit, and in the NAMESPACE file I added import(RUnit).
How can I call uTest() directly after loading my package, without manually loading RUnit?

You should not add RUnit to the Depends (or Imports) field in the DESCRIPTION file (despite the comment to the contrary). Doing so implies that the RUnit package is necessary in order to use your package, which is likely not the case. In other words, putting RUnit in Depends or Imports implies RUnit needs to be installed (Imports) and on the users' search path (Depends) in order for them to use your package.
You should add RUnit to the Suggests field in the DESCRIPTION file, then modify your uTest function as below:
uTest <- function() {
stopifnot(requireNamespace("RUnit"))
test.suite <- RUnit::defineTestSuite('test', dirs = file.path('R/test'))
test.result <- RUnit::runTestSuite(test.suite)
RUnit::printTextProtocol(test.result)
}
Doing this allows you to use RUnit for your tests, but does not require users to have RUnit installed (and possibly on their search path) in order to use your package. Obviously, they'll need RUnit if they wish to run your tests.
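The same guard can be wrapped in a small helper so that a missing optional package produces a readable error instead of a bare stopifnot() failure. This is only a sketch: with_suggested is a hypothetical name, not part of RUnit or base R, and the call at the end uses the stats package purely because it ships with every R installation.

```r
# Hypothetical helper: call a function from an optional (Suggests) package,
# stopping with a readable message when that package is not installed.
with_suggested <- function(pkg, fun, ...) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package '", pkg, "' is needed for this function; please install it.",
         call. = FALSE)
  }
  # getExportedValue() is the programmatic equivalent of pkg::fun
  f <- getExportedValue(pkg, fun)
  f(...)
}

# 'stats' is always available, so this call succeeds
result <- with_suggested("stats", "median", c(1, 5, 3))
print(result)
```

In a package that suggests RUnit, the same pattern would be used with "RUnit" as the package name.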

Related

Config/reticulate not setting up environment - ModuleNotFoundError: No module named 'pandas'

I am trying to build an R package that wraps an internal python module; however, it seems to not be properly installing the dependencies like I would expect. According to the documentation, I should be able to define the Config/reticulate field in the DESCRIPTION file and it will handle setting up the python environment; however, it doesn't seem that's happening.
Reticulate dependency vignette
This is somewhat of a migration project, so currently we're writing R objects to file, then using system(command) to run the python code before reading the results back into R. There are reasons it's done this way, although the plan is to leverage more of the reticulate tools to reduce read/write, I just don't have the time to make those changes right now.
I've replaced system with py_run_file() without success and py_module_available('pandas') returns false after loading the package. If I initialize reticulate in the directory, this resolves the problem, but that doesn't solve my package distribution problem (internal package).
I guess the question is:
How do I ensure reticulate::configure_environment() actually runs when distributing a package?
Below is a rough reproducible example; making these changes to an RStudio new-package template should reproduce the failure.
/R/hello.R:
RunDemoPy <- function(){
pyScript = system.file("python/pyDemo.py", package = "pyTestpkg")
reticulate::py_run_file(pyScript)
}
DESCRIPTION FIELD:
Config/reticulate:
list(
packages = list(
list(package = "pandas")
)
)
Imports:
reticulate
R/zzz.R
.onLoad <- function(libname, pkgname){
packageStartupMessage("On Load - config environment")
reticulate::configure_environment(pkgname)
}
.onUnload <- function(libname){
}
.onAttach <- function(libname, pkgname){
}
inst/python/pyDemo.py:
import os
import pandas as pd
pdf = pd.DataFrame()
pdf.to_csv("demo.csv")
print("Dataframe done")

Extended R package can't correctly communicate with its 'parent' R package

I am trying to build a package that extends another package. However at its most basic level I am doing something wrong. I build a simple example that presents the same issue:
I have two packages, packageA and packageB. packageA has a single R file in the R folder that reads:
local.env.A <- new.env()
setVal <- function()
{
local.env.A$test <- 1
}
getVal <- function()
{
if(!exists("test", envir = local.env.A)) stop("test does not exist")
return(local.env.A$test)
}
For packageB I have the following single R file in the R folder:
# refers to package A
setVal()
getValinA <- function()
{
return(getVal())
}
I want both packageA and packageB to be available for end users, therefore I set packageB to depend on packageA (in the description file). When packageB is loaded, e.g. by means of library(packageB) I expect it to run setVal() and thus set the test value. However, if I next try to get the value that was set by means of getValinA(), it throws me the stop:
> library(packageB)
Loading required package: PackageA
> getValinA()
Error in getVal() : test does not exist
I am pretty sure it is related to environments, but I am not sure how. Please help!
With thanks to @Roland. The answer was very simple. I was under the impression (assumptions assumptions assumptions!) that when you perform library(packageB) it would run all the top-level code within it, in my case the setVal() call. This is however not the case. If you wish this function to be run at load time, you need to place the call within the function .onLoad:
.onLoad <- function(libname, pkgname)
{
setVal()
}
By convention you place this .onLoad function in an R file called zzz.R. The reason is that, unless you specify a collation order for your R scripts, they are loaded alphabetically, and it makes sense to perform your actions when at least all the functions in your package are loaded.
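The pattern can be tried outside a package with a plain script. In the sketch below the .onLoad() call at the end is made by hand only to imitate what R does when a namespace is loaded; in a real package you never call it yourself. The inherits = FALSE argument is an addition not in the original code, so that the lookup stays inside the private environment.

```r
# Private environment holding the state (mirrors local.env.A in the question)
local.env.A <- new.env()

setVal <- function() {
  local.env.A$test <- 1
}

getVal <- function() {
  # inherits = FALSE: look only in local.env.A, not in its parents
  if (!exists("test", envir = local.env.A, inherits = FALSE)) {
    stop("test does not exist")
  }
  local.env.A$test
}

# In a package, R calls this hook itself each time the namespace is loaded
.onLoad <- function(libname, pkgname) {
  setVal()
}

# Imitate the load step by hand, then read the value back
.onLoad(libname = NULL, pkgname = "packageB")
value <- getVal()
print(value)
```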

testthat error on check() but not on test() because of ~/.Rprofile?

EDIT:
Is it possible that ~/.Rprofile is not loaded within check()? It looks like my whole process fails because ~/.Rprofile is not loaded.
DONE EDIT
I have a strange problem on automated testing with testthat. Actually, when I test my package with test() everything works fine. But when I test with check() I get an error message.
The error message says:
1. Failure (at test_DML_create_folder_start_MQ_script.R#43): DML create folder start MQ Script works with "../DML_IC_MQ_DATA/dummy_data" data
capture.output(messages <- source(basename(script_file))) threw an error
Error in sprintf("%s folder got created for each raw file.", subfolder_prefix) :
object 'subfolder_prefix' not found
Before this error I source a script which defines the subfolder_prefix variable and I guess this is why it works in the test() case. But I expected to get this running in the check() function as well.
I will post the complete test script here, hope it is not too complicated:
library(testthat)
context("testing DML create folder and start MQ script")
test_dir <- 'dml_ic_mq_test'
start_dir <- getwd()
# list of test file folders
data_folders <- list.dirs('../DML_IC_MQ_DATA', recursive=FALSE)
for(folder in data_folders) { # for each folder with test files
dir.create(test_dir)
setwd(test_dir)
script_file <- a.DML_prepare_IC.script(dbg_level=Inf) # returns filename I will source
test_that(sprintf('we could copy all files from "%s".',
folder), {
expect_that(
all(file.copy(list.files(file.path('..',folder), full.names=TRUE),
'.',
recursive=TRUE)),
is_true())
})
test_that(sprintf('DML create folder start MQ Script works with "%s" data', folder), {
expect_that(capture.output(messages <- source(basename(script_file))),
not(throws_error()))
})
count_rawfiles <- length(list.files(pattern='.raw$'))
created_folders <- list.dirs(recursive=FALSE)
test_that(sprintf('%s folder got created for each raw file.',
subfolder_prefix), {
expect_equal(length(grep(subfolder_prefix, created_folders)),
count_rawfiles)
})
setwd(start_dir)
unlink(test_dir, recursive=TRUE)
}
In my script I define the variable subfolder_prefix <- 'IC_' and within the test I check if the same number of folders are created for each raw file... This is what my script should do...
So as I said, I am not sure how to debug this problem here since test() works but check() fails during the testthat run.
Now that I know to look in devtools we can find the answer. Per the docs, check "automatically builds and checks a source package, using all known best practices". That includes ignoring .Rprofile. It looks like check calls build and that all of that work is done in a separate (clean) R session. In contrast, test appears to use your currently running session (in a new environment).
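One way to make a test independent of whatever the current session or ~/.Rprofile happens to define is to evaluate the sourced script in its own environment and read the variables back from there. The sketch below is an assumption about how that could look, not the original poster's fix: the temporary file stands in for the generated script, and script_env is a hypothetical name.

```r
# Stand-in for the generated script (hypothetical content)
script_file <- tempfile(fileext = ".R")
writeLines("subfolder_prefix <- 'IC_'", script_file)

# Evaluate the script in a dedicated environment instead of the session
script_env <- new.env()
source(script_file, local = script_env)

# Fetch the variable explicitly from that environment
subfolder_prefix <- get("subfolder_prefix", envir = script_env)
print(subfolder_prefix)
```

Because the variable is looked up in script_env, the test behaves the same in an interactive session and in the clean session used by check().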

Why does using "<<-" in a function in the global workspace work, but not in a package?

I'm creating a package using devtools and roxygen2 (in RStudio), however after I've built the package my function no longer works as intended. Yet, if I load the function's .R file and run the function from there in RStudio, it works perfectly. I've created another package using this method before and it worked fine (13 functions all working as intended from my other package), yet I can't seem to get this new one to work.
To start creating the package I start with:
library("devtools")
devtools::install_github("klutometis/roxygen")
library(roxygen2)
setwd("my parent directory")
create("triale")
All is working fine so far. So I put my .R file containing my function in the R folder under the triale folder. The .R file looks like this:
#' Trial Z Function
#'
#' This function counts the values in the columns
#' @param x is the number
#' @keywords x
#' @export
#' @examples
#' trialz()
trialz = function(x) {w_id= c(25,x,25,25,25,1,1,1,1,1);
wcenter= c(rep("BYSTAR-1",10));
df1 <<- data.frame(w_id, wcenter);
countit <<- data.table(df1);
view <<- countit[, .N, by = list(w_id, wcenter)];
View(view)}
Again if I were to just run the code from the .R file, and test the function it works fine. But to continue, next I enter:
setwd("./triale")
document()
The triale documentation is updated, triale is loaded, and the NAMESPACE and trialz.Rd are both written so that trialz.Rd is under the man folder, and NAMESPACE is under the triale folder as intended. Next I install triale:
setwd("..")
install("triale")
Which I know works because I get the following:
Installing triale
"C:/PROGRA~1/R/R-31~1.3/bin/x64/R" --vanilla CMD INSTALL \
"C:/Users/grice/Documents/R/triale" \
--library="C:/Users/grice/Documents/R/win-library/3.1" --install-tests
* installing *source* package 'triale' ...
** R
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
*** arch - i386
*** arch - x64
* DONE (triale)
Reloading installed triale
Package is now built, so I do the following:
library("triale")
library("data.table")
Note whenever I load the package data.table I get the following error message:
data.table 1.9.4 For help type: ?data.table
*** NB: by=.EACHI is now explicit. See README to restore previous behaviour.
However, it doesn't seem to affect my function. So now it's time to test my function from my package:
trialz(25)
This goes through, and I of course get a populated df1, and countit, but for whatever reason view is always empty (as in 0 obs. of 0 variables).
So I test my work using the dummy code below:
>trialy = function(x) {wid= c(25,x,25,25,25,1,1,1,1,1);
wc= c(rep("BYSTAR-1",10));
df2 <<- data.frame(wid, wc);
countitt <<- data.table(df2);
viewer <<- countitt[, .N, by = list(wid, wc)];
View(viewer)}
>trialy(25)
Even though this is the same exact code with just the names changed around it works. Dumbfounded I open trialz.R and copy the function from there and run it as below, and that works:
> trialz = function(x) {w_id= c(25,x,25,25,25,1,1,1,1,1);
wcenter= c(rep("BYSTAR-1",10));
df1 <<- data.frame(w_id, wcenter);
countit <<- data.table(df1);
view <<- countit[, .N, by = list(w_id, wcenter)];
View(view)}
> trialz(25)
Since I've created a package before I know my method is solid (that package had 13 dif. functions, all of which worked). I just don't understand how a function can work fine as written, yet when I package it, the function no longer works.
Again here is where it stops working as intended when using my package:
view <<- countit[, .N, by = list(w_id, wcenter)];
View(view)}
And my end result should look something like this, if my package worked:
wid wc N
1 25 BYSTAR-1 5
2 1 BYSTAR-1 5
Can anyone explain why view is never populated after I package my function? I've tested it as much as I know how, and my results should be reproducible for anyone who is willing to try it for themselves.
Thanks, I appreciate any feedback.
Your problem here is that "<<-" does not simply create variables in the global environment: it searches the enclosing (parent) environments for an existing binding and assigns there, falling back to the global environment only if none is found. (See help("<<-").)
The parent environment of a function is the environment in which it has been defined. In the case where you defined your function directly in your workspace, this parent environment actually is the same as your workspace environment (namely: .GlobalEnv), which is why your variables are assigned values as you expect them to. In the case where your function is packaged, however, the parent environment is the package namespace and not the .GlobalEnv! The search for an existing variable therefore starts in the namespace, so the assignment may not land where you expect in your workspace.
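A small sketch of how "<<-" behaves, using made-up names (make_counter, set_flag, demo_flag) that are not from the question: the operator walks up the enclosing environments looking for an existing binding, and creates a global variable only when it finds none.

```r
# Case 1: a binding exists in the enclosing environment, so it is updated there
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1   # modifies 'count' inside make_counter's environment
    count
  }
}
counter <- make_counter()
first <- counter()
second <- counter()
print(c(first, second))

# 'count' was never created in the global environment
print(exists("count", envir = globalenv(), inherits = FALSE))

# Case 2: no binding exists anywhere, so the assignment lands in .GlobalEnv
set_flag <- function() {
  demo_flag <<- TRUE
}
set_flag()
print(exists("demo_flag", envir = globalenv(), inherits = FALSE))
```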
Refer to the chapter on environments in Hadley's book and How R Searches and Finds Stuff for more details on environments in R.
Note that doing this would not be considered a proper debugging technique, to say the least. In general, you never want to use the "<<-" operator.
For options on debugging R code, see, e.g., this question. I, in particular, like the debugonce function very well. See ?debugonce.
I forgot one important part when editing my DESCRIPTION file: I forgot to add
Imports: data.table
Also the NAMESPACE file needed to include the data.table package as an import as well, like so:
import(data.table)
export(Z)
export(AS) .... etc.
Doing this ensures that whenever a function within your package uses a function from another package, that (second) package is loaded and its functions are available before your code is executed.

Specify output directory for R package generation

I am trying to automate the procedure of package generation but seem to be unable to tell R where to save the newly generated package.
Here a more detailed explanation of my problem:
First I write a function (or multiple functions) and save it as a separate file in a source directory ("C:/Users/Raphael/Documents/Stats/R/Package_Forge/testpack_SourceFiles") that will be used to generate the package. For illustration purposes, I am using the following test function (file: testpack_test.R). As you can see I am using Hadley Wickham’s roxygen package.
#' @rdname f.test
#' @title Test function
#' @description This function squares a given number.
#' @param x Number
#' @return The function returns a number
#' @export
#'
f.test=function(x){
x=x^2
return(x)
}
Then I use the following script to generate the package, which in this example contains only one function (f.test):
#######################
#*** Load packages ***#
#######################
# Set library path
.libPaths("C:/Users/Raphael/Documents/Stats/R/Package_Use")
#install.packages("roxygen2")
library(digest)
library(roxygen2)
###################
#*** Set paths ***#
###################
# Define Path
pkForge="C:/Users/Raphael/Documents/Stats/R/Package_Forge"
pkUse="C:/Users/Raphael/Documents/Stats/R/Package_Use"
newPk=file.path(pkForge,"testpack")
newPkS=file.path(pkForge,"testpack_SourceFiles")
newPkR=file.path(newPk,"R") #"R" folder that will contain functions
newPkD=file.path(newPk,"DESCRIPTION") #Description file
############################################
#*** Generate directories and add files ***#
############################################
# Generate main directory of new package
if(file.exists(newPk)){
cat("\nExisting directory deleted!")
unlink(newPk,recursive=T) #deletes old directory
cat("\nNew directory generated!\n",newPk)
dir.create(newPk)
}else{
cat("\nNew directory generated!\n",newPk)
dir.create(newPk)
}
# Generate "R" sub directory of new package
dir.create(newPkR)
# Add all scripts in the source directory to "R" sub directory
# Note: roxygen code should be used for function annotation
allScripts=list.files(newPkS,"^testpack_.*?\\.R$", full.names=T, ignore.case=T) #uses regex to only select certain files; returns the entire path
file.copy(allScripts, newPkR)
# Generate a new description file in the package main directory
fileConn=file(newPkD,open="w")
writeLines(c("Package: testpack",
"Type: Package",
"Title: Test package",
"Version: 1.0",
"Date: 2013-08-04",
"Author: XYZ",
"Maintainer: XYZ <xyz@gmail.com>",
"Description: This package contains one test function",
"License:GPL-2"),fileConn)
close(fileConn)
# file.show(newPkD) #shows the content of new file
############################
#*** Roxygenize package ***#
############################
# list.files(MyPackages)
roxygenize(newPk)
#######################
#*** Build package ***#
#######################
cmd=paste("R CMD build ", shQuote(newPk)," --no-manual --no-resave-data", sep="")
system(cmd) #using a system call to build the package
This last system call builds the source package correctly. However, the problem is that for some reason the “tarball” (testpack_1.0.tar.gz) is always saved to C:/Users/Raphael/Documents and I seem to be unable to specify an output directory. I would like to have the tarball saved directly to the pkUse directory ("C:/Users/Raphael/Documents/Stats/R/Package_Use"), which is the folder that I use for all my installed libraries. I tried to add the pkUse directory at various places in the “cmd” string ("R CMD build \"C:/Users/Raphael/Documents/Stats/R/Package_Forge/testpack\" --no-manual --no-resave-data") but it always gives an error. Does anyone have an idea of how to specify the output directory in the above system call? I know that the devtools package is able to do this but would like to be able to use the system call. Thanks so much for any suggestions!
Best,
Raphael
The tarball is being saved to the working directory, so you could setwd() before your system call, then set it back afterwards.
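The setwd() suggestion can be sketched as a small wrapper that restores the previous directory even if the command fails. with_working_dir is a hypothetical helper name, and the writeLines() call only stands in for the real build command so that the example runs without a package to build.

```r
# Hypothetical helper: evaluate 'code' with 'dir' as the working directory,
# restoring the previous directory afterwards, even on error.
with_working_dir <- function(dir, code) {
  old <- setwd(dir)               # setwd() returns the previous directory
  on.exit(setwd(old), add = TRUE)
  force(code)                     # 'code' is lazy, so it runs after the setwd()
}

start_wd <- getwd()
out_dir <- tempdir()

# Stand-in for the build step: anything written to a relative path
# now ends up in 'out_dir'
with_working_dir(out_dir, writeLines("placeholder", "demo_output.txt"))

print(file.exists(file.path(out_dir, "demo_output.txt")))
print(identical(getwd(), start_wd))
```

In the script from the question, the call would be with_working_dir(pkUse, system(cmd)), so that the tarball is written to the pkUse directory.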
Can you use sink?
sink(x) will write the output to the directory and file format that you need.
