Convenience function for exporting objects to the global environment - r

UPDATE: I have added a variant
of Roland's implementation to the kimisc package.
Is there a convenience function for exporting objects to the global environment, which can be called from a function to make objects available globally?
I'm looking for something like
export(obj.a, obj.b)
which would behave like
assign("obj.a", obj.a, .GlobalEnv)
assign("obj.b", obj.b, .GlobalEnv)
Rationale
I am aware of <<- and assign. I need this to refactor oldish code which is simply a concatenation of scripts:
input("script1.R")
input("script2.R")
input("script3.R")
script2.R uses results from script1.R, and script3.R potentially uses results from both 1 and 2. This creates a heavily polluted namespace, and I wanted to change each script
pollute <- the(namespace)
useful <- result
to
(function() {
pollute <- the(namespace)
useful <- result
export(useful)
})()
as a first cheap countermeasure.

Simply write a wrapper:
myexport <- function(...) {
arg.list <- list(...)
names <- all.names(match.call())[-1]
for (i in seq_along(names)) assign(names[i],arg.list[[i]],.GlobalEnv)
}
fun <- function(a) {
ttt <- a+1
ttt2 <- a+2
myexport(ttt,ttt2)
return(a)
}
print(ttt)
#object not found error
fun(2)
#[1] 2
print(ttt)
#[1] 3
print(ttt2)
#[1] 4
Not tested thoroughly and not sure how "safe" that is.

You can create an environment variable and use it within your export function. For example:
env <- .GlobalEnv ## better here to create a new one :new.env()
exportx <- function(x)
{
x <- x+1
env$y <- x
}
exportx(3)
y
[1] 4
For example , If you want to define a global options(emulate the classic R options) in your package ,
my.options <- new.env()
setOption1 <- function(value) my.options$Option1 <- value
EDIT after OP clarification:
You can use evalq which take 2 arguments :
envir the environment in which expr is to be evaluated
enclos where R looks for objects not found in envir.
Here an example:
env.script1 <- new.env()
env.script2 <- new.env()
evalq({
x <- 2
p <- 3
z <- 5
} ,envir = env.script1,enclos=.GlobalEnv)
evalq({
h <- x +2
} ,envir = env.script2,enclos=myenv.script1)`
You can see that all variable are created within the environnment ( like local)
env.script2$h
[1] 4
env.script1$p
[1] 3
> env.script1$x
[1] 2

First, given your use case, I don't see how an export function is any better than using good (?) old-fashioned <<-. You could just do
(function() {
pollute <- the(namespace)
useful <<- result
})()
which will give the same result as what's in your example.
Second, rather than anonymous functions, it seems better form to use local, which allows you to run involved computations without littering your workspace with various temporary objects.
local({
pollute <- the(namespace)
useful <<- result
})
ETA: If it's important for whatever reason to avoid modifying an existing variable called useful, put an exists check in there. The same applies to the other solutions presented.
local({
.....
useful <- result
if(!exists("useful", globalenv())) useful <<- useful
})

Related

Can I access the last computed result before a function 'stop's?

Consider this code:
bad_function <- function() {
# a lot of code
x <- 1
stop("error")
}
tryCatch(bad_function(), error = function(cond) {x})
Obviously, x is not accessible in the error handler. But is there another way to access the value of x without changing bad_function? Alternatively, is there a way to patch bad_function to skip over stop("error") and return x without having to copy all that # a lot of code?
This works if the result you are looking for is named (and the you know the name - here, x):
bad_function <- function() {
# a lot of code
x <- 1
stop("error")
}
.old_stop <- base::stopifnot
.new_stop <- function(...) {
parent.frame()$x
}
assignInNamespace("stop", .new_stop, "base")
bad_function()
assignInNamespace("stop", .old_stop, "base")
I still wonder if there are better solutions.
You could assign the value simultaneously to x in the function environment, as well to another x in an external say debug environment that you defined beforehand.
ev1 <- new.env()
bad_function <- function() {
env <- new.env(parent=baseenv())
# a lot of code
x <- ev1$x <- 1
stop("error")
}
tryCatch(bad_function(), error = function(e) ev1$x)
# [1] 1
The advantage is that .GlobalEnv stays clear (apart from the environment of course).
ls()
# [1] "bad_function" "ev1"

Define function in environment that changes an object in the environment

I would like to write a function that returns an environment containing a function which assigns the value of an object inside the environment. For example, what I want to do is:
makeenv <- function() {
e <- new.env(parent = .GlobalEnv)
e$x <- 0
e$setx <- function(k) { e$x <- k } # NOT OK
e
}
I would like to fix the e$setx function above. The behavior of the above is weird to me:
e1 <- makeenv()
e1$x
## [1] 0
e1$setx
## function(k) e$x <- k
## <environment: 0x7f96144d8240>
e1$setx(3) # Strangely, this works.
e1$x
## [1] 3
# --------- clone ------------
e2 <- new.env(parent = .GlobalEnv)
e2$x <- e1$x
e2$setx <- e1$setx
e2$x
## [1] 3
# ----- e2$setx() changes e1$x -----
e2$setx(7) # HERE
e2$x # e2$x is not changed.
## [1] 3
e1$x # e1$x is changed instead.
## [1] 7
Could someone please help me understand what is going on here? I especially don't understand why e2$setx(7) sets e1$x to 7 rather than issuing an error. I think I am doing something very wrong here.
I would also like to write a correct function e$setx inside the makeenv function that correctly assigns a value to the x object in the environment e. Would it be possible to have one without using S4 or R6 classes? I know that a function like setx <- function(e,k) { e$x <- k } works, but to me e1$setx(5) looks more intuitive than setx(e1,5) and I would like to investigate this possibility first. Is it possible to have something like e$setx <- function(k) { self$x <- k }, say, where self refers to the e preceding the $?
This page The equivalent of 'this' or 'self' in R looks relevant, but I like to have the effect without using S4 or R6. Or am I trying to do something impossible? Thank you.
You can use local to evaluate the function definition in the environment:
local(etx <- function (k) e$x <- k, envir = e)
Alternatively, you can also change the function’s environment after the fact:
e$setx <- function(k) e$x <- k
environment(e$setx) <- e
… but neither is strictly necessary in your case, since you probably don’t need to create a brand new environment. Instead, you can reuse the current calling environment. Doing this is a very common pattern:
makeenv <- function() {
e <- environment()
x <- 0
setx <- function(k) e$x <- k
e
}
Instead of e$x <- k you could also write e <<- k; that way, you don’t need the e variable at all:
makeenv <- function() {
x <- 0
setx <- function(k) x <<- k
environment()
}
… however, I actually recommend against this: <<- is error-prone because it looks for assignment targets in all parent environments; and if it can’t find any, it creates a new variable in the global environment. It’s better to explicitly specify the assignment target.
However, note that none of the above changes the observed semantics of your code: when you copy the function into a new environment, it retains its old environment! If you want to “move over” the function, you explicitly need to reassign its environment:
e2$setx <- e1$setx
environment(e2$setx) <- e2
… of course writing that entire code manually is pretty error-prone. If you want to create value semantics (“deep copy” semantics) for an environment in R, you should wrap this functionality into a function:
copyenv <- function (e) {
new_env <- list2env(as.list(e, all.names = TRUE), parent = parent.env(e), hash = TRUE)
new_env$e <- new_env
environment(new_env$setx) <- new_env
new_env
}
e2 <- copyenv(e1)
Note that the copyenv function is not trying to be general; it needs to be adapted for other environment structures. There is no good, general way of writing a deep-copy function for environments that handles all cases, since a general function can’t know how to handle self-references (i.e. the e in the above): in your case, you want to preserve self-references (i.e. change them to point to the new environment). But in other cases, the reference might need to point to something else.
This is a general problem of deep copying. That’s why the R serialize function, for instance, has refHook parameter that tells the function how to serialise environment references.

Anonymous passing of variables from current environment to subfunction calls

The function testfun1, defined below, does what I want it to do. (For the reasoning of all this, see the background info below the code example.) The question I wanted to ask you is why what I tried in testfun2 doesn't work. To me, both appear to be doing the exact same thing. As shown by the print in testfun2, the evaluation of the helper function inside testfun2 takes place in the correct environment, but the variables from the main function environment get magically passed to the helper function in testfun1, but not in testfun2. Does anyone of you know why?
helpfun <- function(){
x <- x^2 + y^2
}
testfun1 <- function(x,y){
xy <- x*y
environment(helpfun) <- sys.frame(sys.nframe())
x <- eval(as.call(c(as.symbol("helpfun"))))
return(list(x=x,xy=xy))
}
testfun1(x = 2,y = 1:3)
## works as intended
eval.here <- function(fun){
environment(fun) <- parent.frame()
print(environment(fun))
eval(as.call(c(as.symbol(fun))))
}
testfun2 <- function(x,y){
print(sys.frame(sys.nframe()))
xy <- x*y
x <- eval.here("helpfun")
return(list(x=x,xy=xy))
}
testfun2(x = 2,y = 1:3)
## helpfun can't find variable 'x' despite having the same environment as in testfun1...
Background info: I have a large R code in which I want to call helperfunctions inside my main function. They alter variables of the main function environment. The purpose of all this is mainly to unclutter my code. (Main function code is currently over 2000 lines, with many calls to various helperfunctions which themselves are 40-150 lines long...)
Note that the number of arguments to my helper functions is very high, so that the traditional explicit passing of function arguments ( "helpfun(arg1 = arg1, arg2 = arg2, ... , arg50 = arg50)") would be cumbersome and doesnt yield the uncluttering of the code that I am aiming for. Therefore, I need to pass the variables from the parent frame to the helper functions anonymously.
Use this instead:
eval.here <- function(fun){
fun <- get(fun)
environment(fun) <- parent.frame()
print(environment(fun))
fun()
}
Result:
> testfun2(x = 2,y = 1:3)
<environment: 0x0000000013da47a8>
<environment: 0x0000000013da47a8>
$x
[1] 5 8 13
$xy
[1] 2 4 6

How to prevent namespace pollution in R [duplicate]

This is probably not correct terminology, but hopefully I can get my point across.
I frequently end up doing something like:
myVar = 1
f <- function(myvar) { return(myVar); }
# f(2) = 1 now
R happily uses the variable outside of the function's scope, which leaves me scratching my head, wondering how I could possibly be getting the results I am.
Is there any option which says "force me to only use variables which have previously been assigned values in this function's scope"? Perl's use strict does something like this, for example. But I don't know that R has an equivalent of my.
EDIT: Thank you, I am aware of that I capitalized them differently. Indeed, the example was created specifically to illustrate this problem!
I want to know if there is a way that R can automatically warn me when I do this.
EDIT 2: Also, if Rkward or another IDE offers this functionality I'd like to know that too.
As far as I know, R does not provide a "use strict" mode. So you are left with two options:
1 - Ensure all your "strict" functions don't have globalenv as environment. You could define a nice wrapper function for this, but the simplest is to call local:
# Use "local" directly to control the function environment
f <- local( function(myvar) { return(myVar); }, as.environment(2))
f(3) # Error in f(3) : object 'myVar' not found
# Create a wrapper function "strict" to do it for you...
strict <- function(f, pos=2) eval(substitute(f), as.environment(pos))
f <- strict( function(myvar) { return(myVar); } )
f(3) # Error in f(3) : object 'myVar' not found
2 - Do a code analysis that warns you of "bad" usage.
Here's a function checkStrict that hopefully does what you want. It uses the excellent codetools package.
# Checks a function for use of global variables
# Returns TRUE if ok, FALSE if globals were found.
checkStrict <- function(f, silent=FALSE) {
vars <- codetools::findGlobals(f)
found <- !vapply(vars, exists, logical(1), envir=as.environment(2))
if (!silent && any(found)) {
warning("global variables used: ", paste(names(found)[found], collapse=', '))
return(invisible(FALSE))
}
!any(found)
}
And trying it out:
> myVar = 1
> f <- function(myvar) { return(myVar); }
> checkStrict(f)
Warning message:
In checkStrict(f) : global variables used: myVar
checkUsage in the codetools package is helpful, but doesn't get you all the way there.
In a clean session where myVar is not defined,
f <- function(myvar) { return(myVar); }
codetools::checkUsage(f)
gives
<anonymous>: no visible binding for global variable ‘myVar’
but once you define myVar, checkUsage is happy.
See ?codetools in the codetools package: it's possible that something there is useful:
> findGlobals(f)
[1] "{" "myVar" "return"
> findLocals(f)
character(0)
You need to fix the typo: myvar != myVar. Then it will all work...
Scope resolution is 'from the inside out' starting from the current one, then the enclosing and so on.
Edit Now that you clarified your question, look at the package codetools (which is part of the R Base set):
R> library(codetools)
R> f <- function(myVAR) { return(myvar) }
R> checkUsage(f)
<anonymous>: no visible binding for global variable 'myvar'
R>
Using get(x, inherits=FALSE) will force local scope.
myVar = 1
f2 <- function(myvar) get("myVar", inherits=FALSE)
f3 <- function(myvar){
myVar <- myvar
get("myVar", inherits=FALSE)
}
output:
> f2(8)
Error in get("myVar", inherits = FALSE) : object 'myVar' not found
> f3(8)
[1] 8
You are of course doing it wrong. Don't expect static code checking tools to find all your mistakes. Check your code with tests. And more tests. Any decent test written to run in a clean environment will spot this kind of mistake. Write tests for your functions, and use them. Look at the glory that is the testthat package on CRAN.
There is a new package modules on CRAN which addresses this common issue (see the vignette here). With modules, the function raises an error instead of silently returning the wrong result.
# without modules
myVar <- 1
f <- function(myvar) { return(myVar) }
f(2)
[1] 1
# with modules
library(modules)
m <- module({
f <- function(myvar) { return(myVar) }
})
m$f(2)
Error in m$f(2) : object 'myVar' not found
This is the first time I use it. It seems to be straightforward so I might include it in my regular workflow to prevent time consuming mishaps.
you can dynamically change the environment tree like this:
a <- 1
f <- function(){
b <- 1
print(b)
print(a)
}
environment(f) <- new.env(parent = baseenv())
f()
Inside f, b can be found, while a cannot.
But probably it will do more harm than good.
You can test to see if the variable is defined locally:
myVar = 1
f <- function(myvar) {
if( exists('myVar', environment(), inherits = FALSE) ) return( myVar) else cat("myVar was not found locally\n")
}
> f(2)
myVar was not found locally
But I find it very artificial if the only thing you are trying to do is to protect yourself from spelling mistakes.
The exists function searches for the variable name in the particular environment. inherits = FALSE tells it not to look into the enclosing frames.
environment(fun) = parent.env(environment(fun))
will remove the 'workspace' from your search path, leave everything else. This is probably closest to what you want.
#Tommy gave a very good answer and I used it to create 3 functions that I think are more convenient in practice.
strict
to make a function strict, you just have to call
strict(f,x,y)
instead of
f(x,y)
example:
my_fun1 <- function(a,b,c){a+b+c}
my_fun2 <- function(a,b,c){a+B+c}
B <- 1
my_fun1(1,2,3) # 6
strict(my_fun1,1,2,3) # 6
my_fun2(1,2,3) # 5
strict(my_fun2,1,2,3) # Error in (function (a, b, c) : object 'B' not found
checkStrict1
To get a diagnosis, execute checkStrict1(f) with optional Boolean parameters to show more ore less.
checkStrict1("my_fun1") # nothing
checkStrict1("my_fun2") # my_fun2 : B
A more complicated case:
A <- 1 # unambiguous variable defined OUTSIDE AND INSIDE my_fun3
# B unambiguous variable defined only INSIDE my_fun3
C <- 1 # defined OUTSIDE AND INSIDE with ambiguous name (C is also a base function)
D <- 1 # defined only OUTSIDE my_fun3 (D is also a base function)
E <- 1 # unambiguous variable defined only OUTSIDE my_fun3
# G unambiguous variable defined only INSIDE my_fun3
# H is undeclared and doesn't exist at all
# I is undeclared (though I is also base function)
# v defined only INSIDE (v is also a base function)
my_fun3 <- function(a,b,c){
A<-1;B<-1;C<-1;G<-1
a+b+A+B+C+D+E+G+H+I+v+ my_fun1(1,2,3)
}
checkStrict1("my_fun3",show_global_functions = TRUE ,show_ambiguous = TRUE , show_inexistent = TRUE)
# my_fun3 : E
# my_fun3 Ambiguous : D
# my_fun3 Inexistent : H
# my_fun3 Global functions : my_fun1
I chose to show only inexistent by default out of the 3 optional additions. You can change it easily in the function definition.
checkStrictAll
Get a diagnostic of all your potentially problematic functions, with the same parameters.
checkStrictAll()
my_fun2 : B
my_fun3 : E
my_fun3 Inexistent : H
sources
strict <- function(f1,...){
function_text <- deparse(f1)
function_text <- paste(function_text[1],function_text[2],paste(function_text[c(-1,-2,-length(function_text))],collapse=";"),"}",collapse="")
strict0 <- function(f1, pos=2) eval(substitute(f1), as.environment(pos))
f1 <- eval(parse(text=paste0("strict0(",function_text,")")))
do.call(f1,list(...))
}
checkStrict1 <- function(f_str,exceptions = NULL,n_char = nchar(f_str),show_global_functions = FALSE,show_ambiguous = FALSE, show_inexistent = TRUE){
functions <- c(lsf.str(envir=globalenv()))
f <- try(eval(parse(text=f_str)),silent=TRUE)
if(inherits(f, "try-error")) {return(NULL)}
vars <- codetools::findGlobals(f)
vars <- vars[!vars %in% exceptions]
global_functions <- vars %in% functions
in_global_env <- vapply(vars, exists, logical(1), envir=globalenv())
in_local_env <- vapply(vars, exists, logical(1), envir=as.environment(2))
in_global_env_but_not_function <- rep(FALSE,length(vars))
for (my_mode in c("logical", "integer", "double", "complex", "character", "raw","list", "NULL")){
in_global_env_but_not_function <- in_global_env_but_not_function | vapply(vars, exists, logical(1), envir=globalenv(),mode = my_mode)
}
found <- in_global_env_but_not_function & !in_local_env
ambiguous <- in_global_env_but_not_function & in_local_env
inexistent <- (!in_local_env) & (!in_global_env)
if(typeof(f)=="closure"){
if(any(found)) {cat(paste(f_str,paste(rep(" ",n_char-nchar(f_str)),collapse=""),":", paste(names(found)[found], collapse=', '),"\n"))}
if(show_ambiguous & any(ambiguous)) {cat(paste(f_str,paste(rep(" ",n_char-nchar(f_str)),collapse=""),"Ambiguous :", paste(names(found)[ambiguous], collapse=', '),"\n"))}
if(show_inexistent & any(inexistent)) {cat(paste(f_str,paste(rep(" ",n_char-nchar(f_str)),collapse=""),"Inexistent :", paste(names(found)[inexistent], collapse=', '),"\n"))}
if(show_global_functions & any(global_functions)){cat(paste(f_str,paste(rep(" ",n_char-nchar(f_str)),collapse=""),"Global functions :", paste(names(found)[global_functions], collapse=', '),"\n"))}
return(invisible(FALSE))
} else {return(invisible(TRUE))}
}
checkStrictAll <- function(exceptions = NULL,show_global_functions = FALSE,show_ambiguous = FALSE, show_inexistent = TRUE){
functions <- c(lsf.str(envir=globalenv()))
n_char <- max(nchar(functions))
invisible(sapply(functions,checkStrict1,exceptions,n_char = n_char,show_global_functions,show_ambiguous, show_inexistent))
}
What works for me, based on #c-urchin 's answer, is to define a script which reads all my functions and then excludes the global environment:
filenames <- Sys.glob('fun/*.R')
for (filename in filenames) {
source(filename, local=T)
funname <- sub('^fun/(.*).R$', "\\1", filename)
eval(parse(text=paste('environment(',funname,') <- parent.env(globalenv())',sep='')))
}
I assume that
all functions and nothing else are contained in the relative directory ./fun and
every .R file contains exactly one function with an identical name as the file.
The catch is that if one of my functions calls another one of my functions, then the outer function has to also call this script first, and it is essential to call it with local=T:
source('readfun.R', local=T)
assuming of course that the script file is called readfun.R.

R specify function environment

I have a question about function environments in the R language.
I know that everytime a function is called in R, a new environment E
is created in which the function body is executed. The parent link of
E points to the environment in which the function was created.
My question: Is it possible to specify the environment E somehow, i.e., can one
provide a certain environment in which function execution should happen?
A function has an environment that can be changed from outside the function, but not inside the function itself. The environment is a property of the function and can be retrieved/set with environment(). A function has at most one environment, but you can make copies of that function with different environments.
Let's set up some environments with values for x.
x <- 0
a <- new.env(); a$x <- 5
b <- new.env(); b$x <- 10
and a function foo that uses x from the environment
foo <- function(a) {
a + x
}
foo(1)
# [1] 1
Now we can write a helper function that we can use to call a function with any environment.
with_env <- function(f, e=parent.frame()) {
stopifnot(is.function(f))
environment(f) <- e
f
}
This actually returns a new function with a different environment assigned (or it uses the calling environment if unspecified) and we can call that function by just passing parameters. Observe
with_env(foo, a)(1)
# [1] 6
with_env(foo, b)(1)
# [1] 11
foo(1)
# [1] 1
Here's another approach to the problem, taken directly from http://adv-r.had.co.nz/Functional-programming.html
Consider the code
new_counter <- function() {
i <- 0
function() {
i <<- i + 1
i
}
}
(Updated to improve accuracy)
The outer function creates an environment, which is saved as a variable. Calling this variable (a function) effectively calls the inner function, which updates the environment associated with the outer function. (I don't want to directly copy Wickham's entire section on this, but I strongly recommend that anyone interested read the section entitled "Mutable state". I suspect you could get fancier than this. For example, here's a modification with a reset option:
new_counter <- function() {
i <- 0
function(reset = FALSE) {
if(reset) i <<- 0
i <<- i + 1
i
}
}
counter_one <- new_counter()
counter_one()
counter_one()
counter_two <- new_counter()
counter_two()
counter_two()
counter_one(reset = TRUE)
I am not sure I completely track the goal of the question. But one can set the environment that a function executes in, modify the objects in that environment and then reference them from the global environment. Here is an illustrative example, but again I do not know if this answers the questioners question:
e <- new.env()
e$a <- TRUE
testFun <- function(){
print(a)
}
testFun()
Results in: Error in print(a) : object 'a' not found
testFun2 <- function(){
e$a <- !(a)
print(a)
}
environment(testFun2) <- e
testFun2()
Returns: FALSE
e$a
Returns: FALSE

Resources