I would like to find environments by name and get them back as variables. My goal is to make subsequent calls to sys.source even though I have lost track of the environment as a variable, but I still know its name.
Example:
MyFuns <- attach(NULL, name = 'Myfuns')
sys.source('myFunctions.R', envir = MyFuns)
rm('MyFuns')
any(grepl('Myfuns', search())) # It is still there
sys.source('oneMoreFunction.R', envir = MyFuns) # Will not work because the variable has been removed.
Thanks!
Since you've attached the environment, you can find it again with as.environment:
attach(NULL, name = "Myfuns")
assign("a", 1, env = as.environment("Myfuns"))
get("a", env = as.environment("Myfuns"))
sys.source('myFunctions.R', envir = as.environment("Myfuns"))
You may also want to consider making a package and then using devtools::load_all() to load the code - it will also compile C code, respect the NAMESPACE, load other required packages, etc.
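For example, a minimal sketch (the package path here is hypothetical):
# Load the package's R code, data and compiled code into the current session
devtools::load_all("path/to/myFunctionsPackage")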
Related
I'm developing unit tests for a model. I use a wrapper function that sources all necessary functions, loads inputs (to calculate the actual output) and loads the expected output. I would like to pass that data to the test functions in the test directory via test_dir() (the model is not a package, hence not all functionality in devtools works). However, I'm not able to pass them along. I found a workaround by storing the data in the global environment, but this is not best practice. Is there a way to pass them along with the test_dir functionality? Somehow the ellipsis argument does not do the trick for me.
Wrapper code (only works because of the <<- to store in global env):
unittest_4576 <- function() {
  # Load packages
  library(data.table)
  library(giuseppe)
  library(plyr)
  library(testthat)
  library(readxl)

  # Set timezone
  old_val <- Sys.getenv('TZ')
  Sys.setenv(TZ = 'UTC')

  # Load all functions in memory
  source('~/Repos/tests/unittest_subfunctions/source_all_functions.R')
  source_all_functions()

  # Load all inputs and expected outputs in global environment
  inputs_and_expected_outputs <- load_data()
  list_inputs <<- inputs_and_expected_outputs$list_inputs
  list_expected_outputs <<- inputs_and_expected_outputs$list_expected_outputs

  # Perform unit tests
  test_dir(
    '~/Repos/tests/testthat/',
    filter = NULL,
    reporter = NULL,
    env = NULL,
    inputs = list_inputs,
    expected_outputs = list_expected_outputs
  )
}
The following code is the actual test:
test_create_interest_rate_curves <- function(inputs,
                                             expected_outputs,
                                             USE.NAMES = FALSE) {
  # TEST
  # Match outputs
  expect_equal(actual_output, expected_output)
}
testthat::test_that("check_result", {
  test_create_interest_rate_curves(inputs = list_inputs, expected_outputs = list_expected_outputs)
})
So to summarize: the inputs and expected outputs that are passed along in the function call are not available inside the tests. How do I pass them along so that each test still runs in a different environment and the data is not stored in the global environment?
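A minimal sketch of one possible approach, assuming your testthat version supports test_dir()'s env argument: build an environment that holds the fixtures and pass it as env, so the tests can reach the data through scoping without anything being written to the global environment (names reused from the wrapper above).
# Build an environment holding the fixtures
inputs_and_expected_outputs <- load_data()
fixtures <- new.env(parent = globalenv())
fixtures$list_inputs <- inputs_and_expected_outputs$list_inputs
fixtures$list_expected_outputs <- inputs_and_expected_outputs$list_expected_outputs

# Run the tests with that environment instead of falling back to the global one
testthat::test_dir('~/Repos/tests/testthat/', env = fixtures)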
Goal
I want to expose built-in constants from a package I am developing; they originally come from C source code, where they are defined with #define directives.
OpenGL constants defined as #define directives
I'm wrapping the C library GLFW using Rcpp. This C library, in turn, includes OpenGL declarations with many #defines; here's a short snippet from gl.h:
#define GL_T4F_C4F_N3F_V4F 0x2A2D
#define GL_MATRIX_MODE 0x0BA0
#define GL_MODELVIEW 0x1700
#define GL_PROJECTION 0x1701
#define GL_TEXTURE 0x1702
#define GL_POINT_SMOOTH 0x0B10
#define GL_POINT_SIZE 0x0B11
#define GL_POINT_SIZE_GRANULARITY 0x0B13
#define GL_POINT_SIZE_RANGE 0x0B12
#define GL_LINE_SMOOTH 0x0B20
Wrapping of C macro definitions in C++ functions
Now, I've been exposing these C macro definitions by wrapping them in C++ functions, e.g.:
// [[Rcpp::export]]
Rcpp::IntegerVector macro_gl_matrix_mode() {return Rcpp::wrap((unsigned int) GL_MATRIX_MODE);}
Exporting the variables
And then, I have an R source file in data-raw/ that essentially calls those non-exported functions and saves each object to disk (abbreviated for clarity):
library(glfw)
library(tibble)
library(usethis)
library(fs)
library(dplyr)
#
# use_data2 accepts a list of strings with the names of the objects to be
# exported instead of the interface provided by usethis::use_data that expects
# multiple arguments passed in `...`.
#
use_data2 <- function(objs,
                      internal = FALSE,
                      overwrite = FALSE,
                      compress = "bzip2",
                      version = 2,
                      envir = parent.frame())
{
  usethis:::check_is_package("use_data()")
  if (internal) {
    usethis::use_directory("R")
    paths <- fs::path("R", "sysdata.rda")
    objs <- list(objs)
  } else {
    usethis::use_directory("data")
    paths <- fs::path("data", objs, ext = "rda")
  }
  usethis:::check_files_absent(proj_path(paths), overwrite = overwrite)
  usethis::ui_done("Saving {ui_value(unlist(objs))} to {ui_value(paths)}")
  mapply(save, list = objs,
         file = proj_path(paths),
         MoreArgs = list(envir = envir, compress = compress, version = version))
  invisible()
}
gl <- new.env()
# Begin of loads of assign calls
# (...)
assign('GL_MATRIX_MODE', glfw:::macro_gl_matrix_mode(), envir = gl)
# (...)
# End
#
# Exporting
#
gl_lst <- as.list(gl)
gl_names <- names(gl_lst)
use_data2(gl_names, internal = FALSE, compress = "xz", overwrite = TRUE, version = 2, envir = gl)
This is working, but I have 5727 of these constants to be exported. So when I load my package it just hangs for more than 5 minutes at this stage:
*** moving datasets to lazyload DB
So there's got to be a better way, right? Not only is this very slow at package load time, but I'm also guessing that having thousands of objects in my data/ folder is going to create trouble from a package standards or requirements point of view...
Let me just say that I was trying to avoid encapsulating all these constants in a list or data frame because I wanted to keep the API similar to the C library in this respect, i.e., right now I think it is quite nice to be able to simply use the variables GL_MODELVIEW or GL_POINT_SIZE_GRANULARITY directly without any extra syntax.
Any help/pointers are greatly appreciated.
Note: This other question is similar to mine, but it does not have an answer yet, and the scope might be slightly different because my constants originally come from C code, so there might be a more specific solution to my problem, for instance using Rcpp in a way I haven't tried yet: Exporting an unwieldy set of constants in a package.
I had a similar problem. I inherited a project that had a large number of values defined in a file; the app sourced this file to load the data into the global environment. I am converting much of this to a package and wanted these as internal package data. So I wrote this simple script to create the R/sysdata.rda that is loaded when the package is loaded, with "LazyData: true" in the DESCRIPTION file.
# Start with a clean environment
rm(list = ls(all.names = TRUE))

# Data to be saved
strings <- c("a", "b")
my_list <- list(first = c(1, 2, 3), second = seq(1, 10))

# Get the names
data_names <- paste0(ls(), collapse = ",")

# Create string for execution
command <- paste0("usethis::use_data(", data_names, ", internal = TRUE, overwrite = TRUE)")

# Execute
eval(parse(text = command))

# Cleanup
rm(list = ls(all.names = TRUE))
I am studying the following approach:
Step 1
Instead of exporting all those constants separately, I am going to export an environment that encapsulates all the constants: the environment gl (similar to @ralf-stubner's suggestion). This makes the loading (rebuilding of the package) much faster.
So, in my data-raw/gl_macros.R (the data-generating script) I am adding this last line to export the environment gl:
usethis::use_data(gl, internal = FALSE, compress = "xz", overwrite = TRUE, version = 2)
Step 2
And then, to have the convenience of accessing the OpenGL macros with their original names, I add an on-attach hook to my R/zzz.R:
.onAttach <- function(libname, pkgname) {
  for (n in ls(glfw::gl, all.names = TRUE)) assign(n, get(n, glfw::gl), .GlobalEnv)
} # .onAttach()
It seems to work, at least in an interactive session. This last step takes a few seconds, but it's a lot quicker than the original approach. I am now thinking, though, that this won't work if my package is used by other packages; not sure.
Alternative to step 2
Perhaps this will work best:
.onLoad <- function(libname, pkgname) {
  attach(glfw::gl)
} # .onLoad()
Scenario: Let's say I have a master pipeline.R script as follows:
WORKINGDIR <- "my/master/dir"
setwd(WORKINGDIR)

# Step 1
tA <- Sys.time()
source("step1.R")
difftime(Sys.time(), tA)

# Add as many steps as desired, ...
And suppose that, within step1.R, there is a:
rm(list=ls())
Question:
How can I separate the pipeline.R (caller) environment from the step1.R environment?
More specifically, I would like to run step1.R in a separate environment such that any code within it, like the rm above, does not affect the caller's environment.
There are a few ways to call an R script and run it. One of them is source().
source() evaluates the R script and can do so in a specified environment if asked to.
Say we have a Test.R script:
#Test.R
a <- 1
rm(list = ls())
b <- 2
c <- 3
and global variables:
a <- 'a'
b <- 'b'
c <- 'c'
Now you would like to run this script, but in a certain environment not involving the global environment you are calling the script from. You can do this by creating a new environment and then calling source():
step1 <- new.env(parent = baseenv())
#Working directory set correctly.
source("Test.R", local = step1)
These are the results after the run; as you can see, the symbols in the global environment are not deleted.
a
#"a"
b
#"b"
step1$a
#NULL
#rm(list = ls()) actually ran in Test.R
step1$b
#2
Note:
You can also run an R script by using system. This will, however, be run in a different R process, and you will not be able to retrieve anything from it in the session where you called the script.
system("Rscript Test.R")
We create a new environment with new.env()
e1 <- new.env()
and use sys.source to source the R script, specifying envir as the 'e1' created above:
sys.source("step1.R", envir = e1)
If I were to use a configuration file in a normal R script, I would do this:
config.R:
a <- 1
b <- 2
c <- 3
RScript:
source('config.R')
d <- a + b + c
# Do stuff
How would I do this inside an R package? Can I keep a config file and source it inside an R function? Or should I include a, b, c in every function? What's the best practice?
If the configs are to be contained in the R package itself:
Store the config file(s) in the inst/configs folder.
After the package installation the configs are contained in the configs folder of the package installation location (under .libPaths()).
Source the config file using the package installation directory from within a package function:
# load_config() is exported from myPackage
load_config <- function(config_file_name = "default_config.R",
                        config_file_path = system.file("configs", package = getPackageName(), mustWork = TRUE))
{
  env <- new.env()        # all values are then contained in a separate environment
  # env <- globalenv()    # to make the variables visible in the client's environment
  config_file_FQN <- file.path(config_file_path, config_file_name)
  source(config_file_FQN, local = env, keep.source = TRUE)
  return(env)
}
The client can then trigger the configuration and use it (e.g. pass it around):
# client call
myConf <- myPackage::load_config()
print(myConf$YourVariableName)
Or store the environment with the configured variables within the package as a package-global variable; see this example code (sorry, too much to explain here):
https://github.com/aryoda/tryCatchLog/blob/master/R/zzz.R#L47
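A minimal sketch of that package-global-variable pattern (the names .pkg_config and get_config are invented for illustration, not taken from the linked code):
# R/aaa.R: a package-private environment acting as package-global state
.pkg_config <- new.env(parent = emptyenv())

# R/zzz.R: populate it once when the package is loaded
.onLoad <- function(libname, pkgname) {
  cfg_dir <- system.file("configs", package = pkgname)
  cfg <- load_config(config_file_path = cfg_dir)  # load_config() as defined above
  for (nm in ls(cfg, all.names = TRUE)) {
    assign(nm, get(nm, envir = cfg), envir = .pkg_config)
  }
}

# Exported accessor so client code can read a configured value
get_config <- function(name) get(name, envir = .pkg_config)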
1: One option would be to have these as default values in your functions, as in
my_fun <- function(..., a = 1, b = 2)
and so on.
2: Given that what you have in a package is functions, you can easily declare the values in your main functions, so the other functions called by these have access to them.
3: Another option would be to keep them as functions.
a <- function() 1
Now you can call a() whenever you want, as in a() + 2.
4: Another option would be to use environments. I haven't used those much. I think you'll find this useful, in particular the section on Package state; a sketch of that idea follows below.
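A minimal sketch of the package-state idea (the environment name pkg_state and the accessor get_const are invented for illustration): the constants live in an environment defined in the package, and functions in the package read from it.
# R/constants.R: package-level state holding the constants
pkg_state <- new.env(parent = emptyenv())
pkg_state$a <- 1
pkg_state$b <- 2
pkg_state$c <- 3

# A small accessor; other package functions read values through it
get_const <- function(name) pkg_state[[name]]

# e.g. inside another package function:
# d <- get_const("a") + get_const("b") + get_const("c")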
I am in the process of putting together a package I've been working on for almost a year now. I have what I call a hash table that a syllable lookup function requires. The hash table is really just an environment (I think; I'm no computer whiz) that acts as a lookup table. You can see the function I create it with below. I have a data set DICTIONARY (about 20,000 words) that will load when the package is loaded. I also want this DICTIONARY to be passed to the hash function to create a new environment when the package is loaded; something like env <- hash(DICTIONARY), as this is how I load the environment now. How do I make a function run on start-up when the package is loaded so that this new environment is created for those using my package?
hash <- function(x, type = "character") {
  e <- new.env(hash = TRUE, size = nrow(x), parent = emptyenv())
  char <- function(col) assign(col[1], as.character(col[2]), envir = e)
  num <- function(col) assign(col[1], as.numeric(col[2]), envir = e)
  FUN <- if (type == "character") char else num
  apply(x, 1, FUN)
  return(e)
}
#currently how I load the environment with the DICTIONARY lookup table
env <- hash(DICTIONARY)
Here's the head of DICTIONARY if it's helpful:
word syllables
1 hm 1
2 hmm 1
3 hmmm 1
4 hmph 1
5 mmhmm 2
6 mmhm 2
7 mm 1
8 mmm 1
9 mmmm 1
10 pff 1
Many of you may be thinking "This is up to the user to determine if they want the environment loaded". Valid point, but the intended audience of this package is people in the literacy field. Not many in that field are R users, so I have to make this thing as easy as possible to use. I just wanted to get the philosophy of why I want to do this out there so that it doesn't become a point of contention.
Thank you in advance. (PS I've looked at this manual (LINK) but can't seem to locate any info about this topic)
EDIT:
Per Andrei's suggestion, I think it will be something like this? But I'm not sure. Does this load after all the other functions and data sets in the package have loaded? This stuff is a little confusing to me.
.onLoad <- function(){
  env <- hash(DICTIONARY)
}
If the hash is going to change infrequently (this seems like the case, from your problem description), then save the hash into your package source tree as
save(env, file="<my_pkg>/R/sysdata.rda")
After installing the package, env will be available inside the namespace, my_pkg:::env. See section 1.1.3 of "Writing R Extensions". You might have a script, say in "/inst/scripts/make_env.R", that creates env, and that you as the developer use on those rare occasions when env needs to be updated.
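A hypothetical /inst/scripts/make_env.R along those lines, assuming the hash() function and DICTIONARY data set from the question, might look like:
## inst/scripts/make_env.R -- developer-only script, re-run when DICTIONARY changes
## (assumes it is run from the package source directory with the package code loaded,
##  e.g. via devtools::load_all(), so hash() and DICTIONARY are available)
env <- hash(DICTIONARY)
save(env, file = "R/sysdata.rda")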
Another possibility is that the hash changes, but only on package installation. Then the solution is to write code that is evaluated at package installation. So in a file /R/env.R write something along the lines of
env <- local({
  localenv <- new.env(parent = emptyenv())
  ## fill up localenv, then return it
  localenv[["foo"]] <- "bar"
  localenv
})
The case handled by .onLoad is where the data changes each time the package is loaded, e.g., because it retrieves an update from some online source.
env <- new.env(parent = emptyenv())

.onLoad <- function(libname, pkgname)
{
  ## fill up env
  env[["foo"]] <- "bar"
}