I'm developing unit tests for a model. I use a wrapper function that sources all necessary functions, loads inputs (to calculate the actual output) and loads the expected output. I would like to pass that data along with the test functions in the test directory with test_dir() (the model is not a package, hence not all functionalities in devtools work). However, I'm not able to pass them along. I found a workaround by storing the data in the global environment but this is not best practice. Is there a way to pass them along with the test_dir functionality? Somehow the ellipsis argument does not do the trick for me.
Wrapper code (only works because of the <<- to store in global env):
unittest_4576 <- function() {
  # Load packages
  library(data.table)
  library(giuseppe)
  library(plyr)
  library(testthat)
  library(readxl)
  # Set timezone
  old_val <- Sys.getenv('TZ')
  Sys.setenv(TZ = 'UTC')
  # Load all functions in memory
  source('~/Repos/tests/unittest_subfunctions/source_all_functions.R')
  source_all_functions()
  # Load all inputs and expected outputs in global environment
  inputs_and_expected_outputs <- load_data()
  list_inputs <<- inputs_and_expected_outputs$list_inputs
  list_expected_outputs <<- inputs_and_expected_outputs$list_expected_outputs
  # Perform unit tests
  test_dir(
    '~/Repos/tests/testthat/',
    filter = NULL,
    reporter = NULL,
    env = NULL,
    inputs = list_inputs,
    expected_outputs = list_expected_outputs
  )
}
The following code is the actual test:
test_create_interest_rate_curves <- function(inputs,
                                             expected_outputs,
                                             USE.NAMES = FALSE) {
  # TEST (abbreviated: actual_output is computed from inputs here)
  # Match outputs
  expect_equal(actual_output, expected_output)
}
testthat::test_that("check_result", {
  test_create_interest_rate_curves(inputs = list_inputs,
                                   expected_outputs = list_expected_outputs)
})
So to summarize: the inputs and expected outputs that are passed along in the function call are not available inside the test functions. How do I pass them along so that each test still runs in a separate environment and the data is not stored in the global environment?
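One approach worth trying (a minimal sketch, assuming a testthat version whose test_dir() still accepts an env argument; it reuses load_data() and source_all_functions() from the question, and run_unit_tests() is a hypothetical replacement for the wrapper above): put the data in a dedicated environment whose parent is the global environment, then hand that environment to test_dir(). The test files can then refer to list_inputs and list_expected_outputs directly without anything being written to the global environment.
run_unit_tests <- function() {
  library(testthat)
  source('~/Repos/tests/unittest_subfunctions/source_all_functions.R')
  source_all_functions()
  io <- load_data()
  # dedicated environment holding the test data, instead of globalenv()
  data_env <- new.env(parent = globalenv())
  data_env$list_inputs <- io$list_inputs
  data_env$list_expected_outputs <- io$list_expected_outputs
  # the test files are evaluated so that they can see data_env's contents
  test_dir('~/Repos/tests/testthat/', env = data_env)
}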
Goal
I want to expose built-in constants from a package I am developing that come originally from C source code, being defined with #define directives.
OpenGL constants defined as #define directives
I'm wrapping the C library GLFW using Rcpp. This C library, in turn, includes OpenGL declarations, which include many #defines; here's a short snippet from gl.h:
#define GL_T4F_C4F_N3F_V4F 0x2A2D
#define GL_MATRIX_MODE 0x0BA0
#define GL_MODELVIEW 0x1700
#define GL_PROJECTION 0x1701
#define GL_TEXTURE 0x1702
#define GL_POINT_SMOOTH 0x0B10
#define GL_POINT_SIZE 0x0B11
#define GL_POINT_SIZE_GRANULARITY 0x0B13
#define GL_POINT_SIZE_RANGE 0x0B12
#define GL_LINE_SMOOTH 0x0B20
Wrapping of C macro definitions in C++ functions
Now, I've been exposing these C macro definitions by wrapping them in c++ functions, e.g.:
// [[Rcpp::export]]
Rcpp::IntegerVector macro_gl_matrix_mode() {return Rcpp::wrap((unsigned int) GL_MATRIX_MODE);}
Exporting the variables
And then I have an R source file in data-raw/ that essentially calls those non-exported functions and saves each object to disk (abbreviated for clarity):
library(glfw)
library(tibble)
library(usethis)
library(fs)
library(dplyr)
#
# use_data2 accepts a list of strings with the names of the objects to be
# exported instead of the interface provided by usethis::use_data that expects
# multiple arguments passed in `...`.
#
use_data2 <- function(objs,
                      internal = FALSE,
                      overwrite = FALSE,
                      compress = "bzip2",
                      version = 2,
                      envir = parent.frame())
{
  usethis:::check_is_package("use_data()")
  if (internal) {
    usethis::use_directory("R")
    paths <- fs::path("R", "sysdata.rda")
    objs <- list(objs)
  } else {
    usethis::use_directory("data")
    paths <- fs::path("data", objs, ext = "rda")
  }
  usethis:::check_files_absent(proj_path(paths), overwrite = overwrite)
  usethis::ui_done("Saving {ui_value(unlist(objs))} to {ui_value(paths)}")
  mapply(save, list = objs,
         file = proj_path(paths),
         MoreArgs = list(envir = envir, compress = compress, version = version))
  invisible()
}
gl <- new.env()
# Begin of loads of assign calls
# (...)
assign('GL_MATRIX_MODE', glfw:::macro_gl_matrix_mode(), envir = gl)
# (...)
# End
#
# Exporting
#
gl_lst <- as.list(gl)
gl_names <- names(gl_lst)
use_data2(gl_names, internal = FALSE, compress = "xz", overwrite = TRUE, version = 2, envir = gl)
This is working, but I have 5727 of these constants to export. So when I load my package it hangs for more than 5 minutes at this stage:
*** moving datasets to lazyload DB
So there's got to be a better way, right? Not only is this very slow at package load time, but I'm also guessing that having thousands of objects in my data/ folder is going to create trouble from the package standards or requirements point of view...
Let me just say that I was trying to avoid encapsulating all these constants in a list or data frame because I wanted to keep the API similar to the C library in this respect, i.e., right now I think it is quite nice to be able to simply use the variables GL_MODELVIEW or GL_POINT_SIZE_GRANULARITY directly, without any extra syntax.
Any help/pointers are greatly appreciated.
Note: This other question is similar to mine, but it has no answer yet, and the scope might be slightly different because my constants originally come from C code, so there might be a more specific solution to my problem, for instance using Rcpp in a way I haven't tried yet: Exporting an unwieldy set of constants in a package.
I had a similar problem. I inherited a project that had a large number of values defined in a file; the app sourced these files to load the data into the global environment. I am converting much of this to a package and wanted these as internal package data. So I wrote this simple script to create the R/sysdata.rda that is loaded when the package is loaded, with LazyData: true in the DESCRIPTION file.
# Start with a clean environment
rm(list = ls(all.names = TRUE))
# Data to be saved
strings <- c("a", "b")
my_list <- list(first = c(1, 2, 3), second = seq(1, 10))
# Get the names
data_names <- paste0(ls(), collapse = ",")
# Create string for execution
command <- paste0("usethis::use_data(", data_names, ", internal = TRUE, overwrite = TRUE)")
# Execute
eval(parse(text = command))
# Cleanup
rm(list = ls(all.names = TRUE))
I am studying the following approach:
Step 1
Instead of exporting all those constants separately, I am going to export an environment that encapsulates all the constants: environment gl (similar to @ralf-stubner's suggestion). This makes the loading (rebuilding of the package) much faster.
So, in my data-raw/gl_macros.R (the data generating script) I am adding this last line to export the environment gl:
usethis::use_data(gl, internal = FALSE, compress = "xz", overwrite = TRUE, version = 2)
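After rebuilding, the constants can then be read through the exported environment without any hook, e.g. (assuming the package is named glfw, as above):
glfw::gl$GL_MATRIX_MODE
#> [1] 2976  # i.e. 0x0BA0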
Step 2
And then, to have the convenience of accessing the OpenGL macros by their original names, I add an on-attach hook to my R/zzz.R:
.onAttach <- function(libname, pkgname) {
  for (n in ls(glfw::gl, all.names = TRUE)) assign(n, get(n, glfw::gl), .GlobalEnv)
} # .onAttach()
It seems to work, at least in an interactive session! This last step takes a few seconds, but it's a lot quicker than the original approach. I am thinking now that this won't work if my package is used by other packages, though; not sure.
Alternative to step 2
Perhaps this will work better (note, though, that R CMD check flags calls to attach() in package code):
.onLoad <- function(libname, pkgname) {
  attach(glfw::gl)
} # .onLoad()
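A different route I am considering (a sketch I have not tried in the package yet): have data-raw/ write the constants out as plain R source and ship the generated file in R/, exporting the names via NAMESPACE. Ordinary bindings in R/ lazy-load quickly and need no .onAttach/.onLoad hook at all:
# in data-raw/: generate R/gl-constants-generated.R from the `gl` environment
lines <- vapply(ls(gl, all.names = TRUE),
                function(n) sprintf("%s <- %dL", n, as.integer(get(n, envir = gl))),
                character(1))
writeLines(lines, file.path("R", "gl-constants-generated.R"))
# then export the names, e.g. exportPattern("^GL_") in NAMESPACE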
See below for my reprex of my issues with source, <-, <<-, environments, etc.
There are 3 files: testrun.R, which calls inputs.R and CODE.R.
# testrun.R (file 1)
today <<- "abcdef"
source("inputs.R")
for (DC in c("a", "b")) {
  usedlater_3 <- paste("X", DC, used_later2)
  print(usedlater_3)
  source("CODE.R", local = TRUE)
}
final_output <- paste(OD_output, used_later2, usedlater_3)
print(final_output)
# #---- file 2
# # inputs.R
# used_later1 <- paste(today, "_later")
# used_later2 <- "l2"
#
# #---- file 3
# # CODE.R
# OD_output <- paste(DC, today, used_later1, usedlater_2, usedlater_3)
I'm afraid I didn't learn R or CS in a proper way so I'm trying to catch up now. Any bigger picture lessons would be helpful. Previously, I've been relying on a global environment where I keep everything (and save/keep between sessions), but now I'm trying to make everything reproducible, so I'm using RStudio to run local jobs that start from scratch.
I've been trying different combinations of <-, <<-, and source(local = TRUE) (instead of local = FALSE). I do use functions for pieces of code where I know the inputs I need and the outputs I want, but as you can see, CODE.R uses variables from testrun.R, the loop inside testrun.R, and inputs.R. Converting some of the code into functions might help, but I'd like to know of alternatives as well, given this case.
Finally you can see my own troubleshooting log to see my thought process:
first run: variable today wasn't found, so I made today <<- "abcdef" a double-arrow assignment
second run: DC not found, so I will switch to local = TRUE
third run: but now usedlater_2 not found, so I will change usedlater_2 to <<-. (What about used_later1? Why didn't this show up as an error? We'll see...)
result of third run: usedlater_2 still not found when CODE.R needs it. Out of ideas. Note: used_later2 was found when creating usedlater_3 in the for loop in testrun.R.
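Two observations, for what they are worth. First, CODE.R's usedlater_2 looks like a typo for used_later2, which would explain the "not found" errors in the log above regardless of assignment operators. Second, one alternative that avoids both <<- and the global environment is to keep all shared state in one explicit environment and source every file into it; source()'s local argument also accepts an environment. A sketch of the same three files reworked this way (assuming the typo is fixed):
# testrun.R, reworked: all shared state lives in `run`, not globalenv()
run <- new.env(parent = globalenv())
run$today <- "abcdef"
source("inputs.R", local = run)   # used_later1 / used_later2 end up in `run`
for (DC in c("a", "b")) {
  run$DC <- DC
  run$usedlater_3 <- with(run, paste("X", DC, used_later2))
  print(run$usedlater_3)
  source("CODE.R", local = run)   # CODE.R sees every variable in `run`
}
with(run, print(paste(OD_output, used_later2, usedlater_3)))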
I'm building a package which includes my shiny app.
To do this, I built a wrapper around my shiny::runApp call, but unfortunately the shiny app uses global workspace variables.
I want the wrapper function to use the variables I give to the function (and error if they are not supplied) and use them for shiny. Here, for example, it needs x, y and z (which has a default value):
Shiny_wrapper <- function(x, y, z = TRUE) {
  shiny::runApp(appDir = system.file("shinyApp", package = "WebFlood"))
}
I worked around it by assigning the variables to the global workspace, but I don't think this is the right approach:
Shiny_wrapper <- function(x, y, z = TRUE) {
  x <<- x
  y <<- y
  z <<- z
  shiny::runApp(appDir = system.file("shinyApp", package = "WebFlood"))
}
How do I get my shiny to use the variables I pass to the wrapper?
You can define an environment in your package, and use it to pass some variables.
PKGENVIR <- new.env(parent = emptyenv())

#' @export
Shiny_wrapper <- function(x, y, z = TRUE) {
  PKGENVIR$x <- x
  PKGENVIR$y <- y
  PKGENVIR$z <- z
  shiny::runApp(appDir = system.file("shinyApp", package = "WebFlood"))
}
And then in the shiny app (in global.R or server.R):
x <- WebFlood:::PKGENVIR$x
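The same pattern extends to the other values; a global.R along these lines (a sketch):
# global.R (sketch): pull everything the wrapper stored
x <- WebFlood:::PKGENVIR$x
y <- WebFlood:::PKGENVIR$y
z <- WebFlood:::PKGENVIR$z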
I would like to find environments and get them back as variables. My goal is to be able to make subsequent calls to sys.source even though I have lost track of the environment as a variable, but I know its name.
Example:
MyFuns <- attach(NULL, name = 'Myfuns')
sys.source('myFunctions.R', envir = MyFuns)
rm('MyFuns')
any(grepl('Myfuns', search())) # It is still there
sys.source('oneMoreFunction.R', envir = MyFuns) # Will not work because the variable has been removed.
Thanks!
Since you've attached the environment, you can find it again with as.environment:
attach(NULL, name = "Myfuns")
assign("a", 1, env = as.environment("Myfuns"))
get("a", env = as.environment("Myfuns"))
sys.source('myFunctions.R', envir = as.environment("Myfuns"))
You may also want to consider making a package and then using devtools::load_all() to load the code - it will also compile C code, respect the NAMESPACE, load other required packages, etc.
I know I can use ls() and rm() to see and remove objects that exist in my environment.
However, when dealing with an "old" .RData file, one sometimes needs to pick an environment apart to find what to keep and what to leave out.
What I would like to do is have a GUI-like interface that allows me to see the objects, sort them (for example, by their size), and remove the ones I don't need (for example, via a check-box interface). Since I imagine such a system is not currently implemented in R, what ways do exist? What do you use for cleaning old .RData files?
Thanks,
Tal
I never create .RData files. If you are practicing reproducible research (and you should be!) you should be able to source in R files to go from input data files to all outputs.
When you have operations that take a long time it makes sense to cache them. I often use a construct like:
if (file.exists("cache.rdata")) {
  load("cache.rdata")
} else {
  # do stuff ...
  save(..., file = "cache.rdata")
}
This allows you to work quickly from cached files, and when you need to recalculate from scratch you can just delete all the rdata files in your working directory.
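Filled in with concrete (made-up) names, the idiom looks like this:
if (file.exists("model_fit.rdata")) {
  load("model_fit.rdata")              # restores `fit` from the cache
} else {
  fit <- lm(mpg ~ wt, data = mtcars)   # stand-in for the slow computation
  save(fit, file = "model_fit.rdata")
}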
The basic solution is to load your data, remove what you don't want, and save it as new, clean data.
Another way to handle this situation is to control the loaded RData by loading it into its own environment:
sandbox <- new.env()
load("some_old.RData", sandbox)
Now you can see what is inside
ls(sandbox)
sapply(ls(sandbox), function(x) object.size(get(x,sandbox)))
Then you have several possibilities:
write what you want to new RData: save(A, B, file="clean.RData", envir=sandbox)
remove what you don't want from environment rm(x, z, u, envir=sandbox)
make copy of variables you want in global workspace and remove sandbox
I usually do something similar to the third option: load my data, do some checks and transformations, copy the final data to the global workspace, and remove the environment.
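A sketch of that third option, with A and B as stand-in names for the objects to keep:
# copy the keepers from the sandbox into the global workspace, then drop it
for (nm in c("A", "B")) assign(nm, get(nm, envir = sandbox), envir = globalenv())
rm(sandbox)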
You could always implement what you want. So:
1. Load the data
vars <- load("some_old.RData")
2. Get sizes
vars_size <- sapply(vars, function(x) object.size(get(x)))
3. Order them
vars <- vars[order(vars_size, decreasing = TRUE)]
vars_size <- vars_size[order(vars_size, decreasing = TRUE)]
4. Make a dialog box (depends on the OS; here is Windows)
vars_with_size <- paste(vars, vars_size)
vars_to_save <- select.list(vars_with_size, multiple = TRUE)
5. Remove what you don't want
rm(list = vars[!vars_with_size %in% vars_to_save])
To get a nicer form of the object sizes, I use a solution based on getAnywhere(print.object_size):
pretty_size <- function(x) {
  ifelse(x >= 1024^3, paste(round(x/1024^3, 1L), "Gb"),
  ifelse(x >= 1024^2, paste(round(x/1024^2, 1L), "Mb"),
  ifelse(x >= 1024,   paste(round(x/1024,   1L), "Kb"),
         paste(x, "bytes"))))
}
Then in step 4 one can use paste(vars, pretty_size(vars_size)).
You may want to check out the RGtk2 package.
You can very easily create an interface with Glade Interface Designer and then attach whatever R commands you want to it.
If you want a good starting point from which to "steal" ideas on how to use RGtk2, install the rattle package and run rattle(). Then look at the source code and start making your own interface :)
I may have a go at it and see if I can come out with something simple.
EDIT: this is a quick and dirty piece of code that you can play with. The big problem in my first version was that the rm instruction never seemed to get executed: rm(obj) removed the handler's local variable obj rather than the object it names. Using rm(list = obj, envir = globalenv()) fixes that, and the interface works! :D
TODO:
I put all the variables in the remObjEnv environment. It should not be listed among the current variables, and it should be removed when the window is closed
The list will only show objects in the global environment; anything inside other environments won't be shown, but that's easy enough to implement
probably there's some other bug I haven't thought of :D
Enjoy
# Our environment
remObjEnv <<- new.env()
# Various required libraries
require("RGtk2")
remObjEnv$createModel <- function()
{
  # create the array of data and fill it in
  remObjEnv$objList <- NULL
  objs <- objects(globalenv())
  for (o in objs)
    remObjEnv$objList[[length(remObjEnv$objList) + 1]] <- list(object = o,
                                                               type = typeof(get(o)),
                                                               size = object.size(get(o)))
  # create list store
  model <- gtkListStoreNew("gchararray", "gchararray", "gint")
  # add items
  for (i in 1:length(remObjEnv$objList))
  {
    iter <- model$append()$iter
    model$set(iter,
              0, remObjEnv$objList[[i]]$object,
              1, remObjEnv$objList[[i]]$type,
              2, remObjEnv$objList[[i]]$size)
  }
  return(model)
}
remObjEnv$addColumns <- function(treeview)
{
  colNames <- c("Name", "Type", "Size (bytes)")
  model <- treeview$getModel()
  for (n in 1:length(colNames))
  {
    renderer <- gtkCellRendererTextNew()
    renderer$setData("column", n - 1)
    treeview$insertColumnWithAttributes(-1, colNames[n], renderer, text = n - 1)
  }
}
# Builds the list.
# I seem to have some problems in correctly build treeviews from glade files
# so we'll just do it by hand :)
remObjEnv$buildTreeView <- function()
{
  # create model
  model <- remObjEnv$createModel()
  # create tree view
  remObjEnv$treeview <- gtkTreeViewNewWithModel(model)
  remObjEnv$treeview$setRulesHint(TRUE)
  remObjEnv$treeview$getSelection()$setMode("single")
  remObjEnv$addColumns(remObjEnv$treeview)
  remObjEnv$vbox$packStart(remObjEnv$treeview, TRUE, TRUE, 0)
}
remObjEnv$delObj <- function(widget, treeview)
{
  model <- treeview$getModel()
  selection <- treeview$getSelection()
  selected <- selection$getSelected()
  if (selected[[1]])
  {
    iter <- selected$iter
    path <- model$getPath(iter)
    i <- path$getIndices()[[1]]
    model$remove(iter)
    # rm(obj) would only delete the local variable `obj`;
    # rm(list = obj, envir = ...) deletes the object it names
    obj <- as.character(remObjEnv$objList[[i + 1]]$object)
    rm(list = obj, envir = globalenv())
  }
}
# The list of the current objects
remObjEnv$objList <- NULL
# Create the GUI.
remObjEnv$window <- gtkWindowNew("toplevel", show = FALSE)
gtkWindowSetTitle(remObjEnv$window, "R Object Remover")
gtkWindowSetDefaultSize(remObjEnv$window, 500, 300)
remObjEnv$vbox <- gtkVBoxNew(FALSE, 5)
remObjEnv$window$add(remObjEnv$vbox)
# Build the treeview
remObjEnv$buildTreeView()
remObjEnv$button <- gtkButtonNewWithLabel("Delete selected object")
gSignalConnect(remObjEnv$button, "clicked", remObjEnv$delObj, remObjEnv$treeview)
remObjEnv$vbox$packStart(remObjEnv$button, TRUE, TRUE, 0)
remObjEnv$window$showAll()
Once you've figured out what you want to keep, you can use the function keep() from the package gdata, which does what its name suggests.
a <- 1
b <- 2
library(gdata)
keep(a, all = TRUE, sure = TRUE)
See help(keep) for details on the all and sure options.
all: whether hidden objects (beginning with a .) should be removed, unless explicitly kept.
sure: whether to perform the removal, otherwise return names of objects that would have been removed.
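Continuing the a/b example above (before the sure = TRUE call): with sure left at its default of FALSE, keep() is a dry run that only reports what would go.
keep(a)                # returns "b": the object that would be removed
keep(a, sure = TRUE)   # actually removes b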
This function is so useful that I'm surprised it isn't part of R itself.
The OS X GUI does have such a thing; it's called the Workspace Browser. Quite handy.
I've also wished for an interface that shows the session dependency between objects, i.e. if I start from a plot() and work backwards to find all the objects that were used to create it. This would require parsing the history.
It doesn't have checkboxes to delete with; rather, you select the file(s) and then click delete. However, the solution below is pretty easy to implement:
library(gWidgets)
options(guiToolkit="RGtk2")
## make data frame with files
out <- lapply((x <- list.files()), file.info)
out <- do.call("rbind", out)
out <- data.frame(name=x, size=as.integer(out$size), ## more attributes?
stringsAsFactors=FALSE)
## set up GUI
w <- gwindow("Browse directory")
g <- ggroup(cont=w, horizontal=FALSE)
tbl <- gtable(out, cont=g, multiple=TRUE)
size(tbl) <- c(400,400)
deleteThem <- gbutton("delete", cont=g)
enabled(deleteThem) <- FALSE
## add handlers
addHandlerClicked(tbl, handler=function(h,...) {
enabled(deleteThem) <- (length(svalue(h$obj, index=TRUE)) > 0)
})
addHandlerClicked(deleteThem, handler=function(h,...) {
inds <- svalue(tbl, index=TRUE)
files <- tbl[inds,1]
print(files) # replace with rm?
})
The poor man's answer could be:
ls()
# spot the rank of the variables you want to remove, for example 10 to 25
rm(list = ls()[10:25])
# repeat until satisfied
To clean the complete environment you can try:
rm(list = ls())