I have written a function that sources files that contain scripts for other functions and stores these functions in an alternative environment so that they aren't cluttering up the global environment. The code works, but contains three instances of eval(parse(...)):
# sourceFunctionHidden ---------------------------
# source a function and hide the function from the global environment
sourceFunctionHidden <- function(functions, environment = "env", ...) {
  if (environment %in% search()) {
    while (environment %in% search()) {
      if (!exists("counter", inherits = F)) counter <- 0
      eval(parse(text = paste0("detach(", environment, ")")))
      counter <- counter + 1
    }
    cat("detached", counter, environment, "s\n")
  } else {cat("no", environment, "attached\n")}
  if (!environment %in% ls(.GlobalEnv, all.names = T)) {
    assign(environment, new.env(), pos = .GlobalEnv)
    cat("created", environment, "\n")
  } else {cat(environment, "already exists\n")}
  sapply(functions, function(func) {
    source(paste0("C:/Users/JT/R/Functions/", func, ".R"))
    eval(parse(text = paste0(environment, "$", func," <- ", func)))
    cat(func, "created in", environment, "\n")
  })
  eval(parse(text = paste0("attach(", environment, ")")))
  cat("attached", environment, "\n\n")
}
Much has been written about the sub-optimality of the eval(parse(...)) construction (see here and here). However, the discussions that I've found mostly deal with alternate strategies for subsetting. The first and third instances of eval(parse(...)) in my code don't involve subsetting (the second instance might be related to subsetting).
Is there a way to call new.env(...), [environment name]$[function name] <- [function name], and attach(...) without resorting to eval(parse(...))? Thanks.
N.B.: I don't want to change the names of my functions to .name to hide them in the global environment
For what it's worth, the function source actually uses eval(parse(...)), albeit in a somewhat subtle way. First, .Internal(parse(...)) is used to create expressions, which after more processing are later passed to eval. So eval(parse(...)) seems to be good enough for the R core team in this instance.
That said, you don't need to jump through hoops to source functions into a new environment. source provides an argument local that can be used for precisely this.
local: TRUE, FALSE or an environment, determining where the parsed expressions are evaluated.
An example:
env = new.env()
source('test.r', local = env)
Testing that it works:
env$test('hello', 'world')
# [1] "hello world"
ls(pattern = 'test')
# character(0)
And an example test.r file to use this on:
test = function(a,b) paste(a,b)
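A self-contained variant of the example above, for readers who want to try it without creating test.r by hand. The temporary file and the name fun_env are illustrative assumptions, not part of the original answer.

```r
# Write a one-function script to a temporary file (stand-in for test.r)
script <- tempfile(fileext = ".R")
writeLines("test <- function(a, b) paste(a, b)", script)

# Evaluate the script inside a dedicated environment instead of the caller's
fun_env <- new.env()
source(script, local = fun_env)

# The function is reachable through fun_env
greeting <- fun_env$test("hello", "world")
print(greeting)

# In a fresh session, nothing named 'test' was created in the global environment
print(exists("test", envir = globalenv(), inherits = FALSE))
```

Because local accepts an environment object, no string-to-name conversion is needed at this step.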
If you want to keep it out of the global environment, put it into a package. It's common for people in the R community to put a bunch of frequently used helper functions into their own personal package.
tl;dr: The right way to convert quoted strings to object names is to use assign() and get(). See this post.
The long answer: The answer from @dww about being able to source() directly to a specific environment led me to change the second instance of eval(parse(...)) as follows:
# old version
source(paste0("C:/Users/JT/R/Functions/", func, ".R"))
eval(parse(text = paste0(environment, "$", func," <- ", func)))
# new version
source(
paste0("C:/Users/JT/R/Functions/", func, ".R"),
local = get(environment)
)
The answer from @dww also got me exploring attach(). attach() accepts an environment object as its first argument, and its name argument sets the label under which that environment appears in search(). This led me to change the third instance of eval(parse(...)) (below). Note the use of get() to convert the "env" string held in environment into the env object that attach() requires.
# old version
eval(parse(text = paste0("attach(", environment, ")")))
# new version
attach(get(environment), name = environment)
Finally, at some point in this process I was reminded that some functions, such as library(), have a character.only argument. detach() accepts the same argument, so I changed the first instance of eval(parse(...)) as below:
# old version
eval(parse(text = paste0("detach(", environment, ")")))
# new version
detach(environment, character.only = T)
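The two replacements above can be tried in isolation with a small sketch like the following. The names env_name and hello are hypothetical and exist only for this illustration.

```r
env_name <- "env"                        # environment name held as a string
assign(env_name, new.env())              # create the environment under that name
assign("hello", function() "hi", envir = get(env_name))

# attach() takes the environment object; 'name' sets its label in search()
attach(get(env_name), name = env_name)
is_attached <- env_name %in% search()
greeting <- hello()                      # found through the search path

# detach() takes the label as a string when character.only = TRUE
detach(env_name, character.only = TRUE)
still_attached <- env_name %in% search()

print(c(is_attached, still_attached))
```

Note that attach() places a copy of the environment's contents on the search path, so functions added to the original environment afterwards are not visible until it is attached again.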
So my new code is:
# sourceFunctionHidden ---------------------------
# source a function and hide the function from the global environment
sourceFunctionHidden <- function(functions, environment = "env", ...) {
  if (environment %in% search()) {
    while (environment %in% search()) {
      if (!exists("counter", inherits = F)) counter <- 0
      detach(environment, character.only = T)
      counter <- counter + 1
    }
    cat("detached", counter, environment, "s\n")
  } else {cat("no", environment, "attached\n")}
  if (!environment %in% ls(.GlobalEnv, all.names = T)) {
    assign(environment, new.env(), pos = .GlobalEnv)
    cat("created", environment, "\n")
  } else {cat(environment, "already exists\n")}
  sapply(functions, function(func) {
    source(
      paste0("C:/Users/JT/R/Functions/", func, ".R"),
      local = get(environment)
    )
    cat(func, "created in", environment, "\n")
  })
  attach(get(environment), name = environment)
  cat("attached", environment, "\n\n")
}
Related
I need to access (i.e., read and save) the items of the environment I'm working in. I have written the following function to save all objects in my (global) environment:
save_vars <- function(list.of.vars = NULL,
                      prefix = "StatusQuo",
                      path = "data") {
  if (is.null(list.of.vars)) list.of.vars <- ls()
  date_time <- Sys.time()
  if (!is.null(path))
    path <- paste0(path, "/")
  file_name <- paste0(path, prefix, "_", date_time, ".RData")
  save(list = list.of.vars, file = file_name)
}
The idea was that if no list.of.vars argument is passed to the function, using ls(), the function accesses the variables of the environment calling save_vars. However, it only saves the variables within the scope of the function itself. I know I can call the function as save_vars(ls()) to do the job, but is there a neater way around it?
Probably cleanest to pass the environment:
fun <- function(envir = parent.frame()) ls(envir = envir)
fun()
This lists the objects in the caller but also lets the user change which environment is used. For example, they could force the global environment to be used:
fun(.GlobalEnv)
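Applied to the save_vars function from the question, the same idea might look like the sketch below. The envir argument and the timestamp format are assumptions added for illustration; the sketch also assumes the target directory already exists.

```r
save_vars <- function(list.of.vars = NULL,
                      prefix = "StatusQuo",
                      path = "data",
                      envir = parent.frame()) {
  # Resolve the caller's environment before doing anything else
  force(envir)
  # Default to every object in the chosen environment
  if (is.null(list.of.vars)) list.of.vars <- ls(envir = envir)
  # Timestamp without spaces or colons (an assumption, for portable file names)
  stamp <- format(Sys.time(), "%Y%m%d_%H%M%S")
  file_name <- paste0(prefix, "_", stamp, ".RData")
  if (!is.null(path)) file_name <- file.path(path, file_name)
  # save() looks the names up in the same environment they were listed from
  save(list = list.of.vars, file = file_name, envir = envir)
  invisible(file_name)
}

# Usage: called inside a function, it saves that function's local variables
demo <- function() {
  a <- 1
  b <- "x"
  save_vars(path = tempdir(), prefix = "demo")
}
saved_file <- demo()
```

Passing envir to both ls() and save() matters: listing names from one environment and saving from another would fail or save the wrong objects.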
Here is a toy example to illustrate my problem.
library(foreach)
library(doMC)
registerDoMC(cores=2)
foreach(i = 1:2) %dopar%{
i + 2
}
[[1]]
[1] 3
[[2]]
[1] 4
So far so good...
But if the code i + 2 is saved in the file addition.R and I call that file using source(), then
> foreach(i = 1:2) %dopar%{
+ source("addition.R")
+ }
Error in { : task 1 failed - "object 'i' not found"
I cannot fully reproduce your toy example, but I had a similar problem, which I was able to solve with:
source(file, local = TRUE)
which should evaluate the sourced code in the local environment, i.e. where i is visible.
The comment by NiceE and the answer by Sosel already address this: source(file) defaults to source(file, local = FALSE), which means that the code in the sourced file is evaluated in the global environment (the "user's workspace"), cf. ?source. Note that there is no variable i in the global environment. The solution is to make sure the file is sourced in the environment that calls it, i.e. to use source(file, local = TRUE).
Solution:
library("foreach")
y <- foreach(i = 1:2) %dopar% {
i + 2
}
str(y)
doMC::registerDoMC(cores = 2L)
y <- foreach(i = 1:2) %dopar% {
source("addition.R", local = TRUE)
}
str(y)
Example of the same problem with a for() loop:
The fact that source() evaluates in the global environment, which is different from the calling environment where i lives, can also be illustrated with a regular for loop by running the loop in an environment other than the global one, e.g. inside a function or by:
local({
for(i in 1:2) {
source("addition.R")
}
})
which gives:
Error in eval(ei, envir) : object 'i' not found
Now, the reason why the above foreach(i = 1:2) %dopar% { source("addition.R") } works with registerDoSEQ() if and only if it is called from the global environment is that the foreach iteration is then evaluated in the calling environment, which is the global environment, which is also the environment that source() uses. However, if one uses local(foreach(i = 1:2) %dopar% { ... }), this also fails, analogously to the above local(for(i in 1:2) { ... }) call.
In conclusion: nothing magic happens, but to understand it is a bit tedious.
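The difference between the two settings of local can be reproduced in base R, without foreach or a parallel backend. In this sketch the temporary file stands in for addition.R, and the two wrapper functions are hypothetical names; it assumes no variable i exists in the global environment.

```r
# Stand-in for addition.R, written to a temporary file
addition_file <- tempfile(fileext = ".R")
writeLines("i + 2", addition_file)

sourced_globally <- function() {
  i <- 1
  # Default local = FALSE: the file is evaluated in the global environment,
  # which cannot see this function's 'i'
  tryCatch(source(addition_file)$value,
           error = function(e) conditionMessage(e))
}

sourced_locally <- function() {
  i <- 1
  # local = TRUE: the file is evaluated in this function's frame
  source(addition_file, local = TRUE)$value
}

global_result <- sourced_globally()  # an error message about 'i'
local_result <- sourced_locally()    # the value of i + 2
print(global_result)
print(local_result)
```

source() returns (invisibly) the value of the last expression it evaluated, which is why $value can be used to retrieve the result.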
I finally solved the problem by converting the code in addition.R into a function and simply passing the variables into it. I don't know why, but the suggested solutions based on source(file, local = TRUE) do not work.
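The function-based workaround might look roughly like this. The name add_two is hypothetical, and lapply() stands in for the foreach loop so that the sketch runs without a parallel backend.

```r
# addition.R would define a function instead of a bare expression
add_two <- function(i) i + 2

# The loop variable is passed explicitly, so no environment lookup is involved
result <- lapply(1:2, add_two)
str(result)
```

Since the value of i arrives as an argument, the result no longer depends on which environment the code happens to be evaluated in.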
I have a variable in my global environment called myList. I have a function that modifies myList and re-assigns it to the global environment called myFunction. I only want myList to be modified by myFunction. Is there a way to prevent any other function from modifying myList?
For background, I am building a general tool for R users. I don't want users of the tool to be able to define their own function to modify myList. I also don't want to be able to modify myList myself with a function I may write in the future.
I have a potential solution, but I don't like it. When the tool is executed, I could examine the text of every function defined by a user and search for the text that will assign myList to the global environment. I don't like the fact that I need to search over all functions.
Does anyone know if what I am looking for is implementable in R? Thanks for any help that can be provided.
For a reproducible example, I need code that will make the following possible:
assign('myList', list(), envir = globalenv())
myFunction <- function() {
myList <- c(myList, 'test')
assign('myList', myList, envir = globalenv())
}
userFunction <- function() {
myList <- c(myList, 'test')
assign('myList', myList, envir = globalenv())
}
myFunction() # I need some code that will allow this function to run successfully
userFunction() # and cause an error when this function runs
Sounds like you need the modules package.
Basically, each unit of code has its own scope.
e.g.
# install.packages("modules")
# Load library
library("modules")
# Create a basic module
m <- module({
  .myList <- list()
  myFunction <- function() {
    .myList <<- c(.myList, 'test')
  }
  get <- function() .myList
})
# Accessor
m$get()
# list()
# Your function
m$myFunction()
# Modification
m$get()
# [[1]]
# [1] "test"
Note that we tweaked the example slightly by changing the variable name from myList to .myList. So we'll need to update that in userFunction():
userFunction <- function() {
.myList <- c(.myList, 'test')
}
Running this, we now get:
userFunction()
# Error in userFunction() : object '.myList' not found
As desired.
For more detailed examples see modules vignette.
The alternative is you can define an environment (new.env()) and then lock it after you have loaded myList.
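A minimal sketch of that alternative, assuming a dedicated environment named store (a hypothetical name) rather than the global environment. A determined user can still call unlockBinding() themselves, so this signals intent rather than enforcing it.

```r
# Private store instead of a global variable
store <- new.env()
assign("myList", list(), envir = store)

# Lock the environment and every binding currently in it
lockEnvironment(store, bindings = TRUE)

myFunction <- function() {
  # Temporarily unlock, and make sure the lock is restored on exit
  unlockBinding("myList", store)
  on.exit(lockBinding("myList", store))
  assign("myList", c(get("myList", envir = store), "test"), envir = store)
  invisible(NULL)
}

userFunction <- function() {
  # No unlock here, so this assignment fails
  assign("myList", c(get("myList", envir = store), "test"), envir = store)
}

myFunction()
user_result <- tryCatch(userFunction(), error = function(e) "blocked")
print(length(store$myList))
print(user_result)
```

Using on.exit() for the re-lock means the binding is locked again even if the update inside myFunction() throws an error.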
This is an all-around bad idea, from the assignment into the global environment (I'd never use a package that does this) to surprising your users. You should probably just use S4 or reference classes.
Anyway, you can lock the bindings (or environment if you followed better practices). You wouldn't stop an advanced user with that, but they would at least know that you don't want them to change the object.
createLocked <- function(x, name, env) {
assign(name, x, envir = env)
lockBinding(name, env)
invisible(NULL)
}
createLocked(list(), "myList", globalenv())
myFunction <- function() {
unlockBinding("myList", globalenv())
myList <- c(myList, 'test')
assign('myList', myList, envir = globalenv())
lockBinding("myList", globalenv())
invisible(NULL)
}
userFunction <- function() {
myList <- c(myList, 'test')
assign('myList', myList, envir = globalenv())
}
myFunction() # runs successfully
userFunction()
#Error in assign("myList", myList, envir = globalenv()) :
# cannot change value of locked binding for 'myList'
I am trying to get my head around the Snowfall library and its usage.
Having written a simulation that makes use of environments, I encountered the following issue. If I source a file to load functions within parallel mode, the function seems to use a different environment than when I declare the function within parallel mode directly.
To make things a little clearer, let's consider the following two scripts:
q_func.R declares the function
foo.bar <- function(x, envname) assign("val", x, envir = get(envname))
# assigns the value x to the variable "val" in the environment envname
q_snowfall.R main function that uses snowfall
library(snowfall)
SnowFunc <- function(envname) {
  # load the functions
  # Option 1 not working
  source("q_func.R")
  # Option 2 working...
  # foo.bar <- function(x, envname) assign("val", x, envir = get(envname))
  # create the new environment
  assign(envname, new.env())
  # use the function as declared in q_func.R
  # to assign random numbers to the new env
  foo.bar(x = rnorm(1), envname = envname)
  # return the environment including the random values
  return(get("val", envir = get(envname)))
}
sfInit(parallel = TRUE, cpus = 2)
# create environment 'a' and 'b' that each will get a new variable
# called 'val' that gets assigned a random value
envs <- c("a", "b")
result <- sfClusterApplyLB(envs, SnowFunc)
sfStop()
If I execute the script "q_snowfall.R" I get the error
Error in checkForRemoteErrors(val) :
2 nodes produced errors; first error: object 'a' not found
However, if I use the second option (declaring the function within the SnowFunc function), the error disappears.
Do you know how Snowfall handles the different environments? Or do you even have a solution for the issue? (Note that 'q_func.R' actually takes some 100 lines of code, so I would prefer to have it in a separate file; keeping option 2 is therefore not a solution!)
Thank you very much!
Edit
If I change all get(envname) to get(envname, envir = globalenv()) it seems to work. But it seems to me that this is more or less a workaround and not a very snowfall-like solution.
I think the issue is not with snowfall but with the fact that you're passing the environment by name (as a character string). You don't need to change all occurrences of get, and having it look in the global environment may indeed be unsafe.
It is sufficient to change the get call in foo.bar to look in parent.frame() instead (i.e., the environment from which foo.bar was called). The following worked on my machine.
new q_func.R
foo.bar <- function(x, envname) assign("val", x, envir=get(envname,
pos=parent.frame()))
(not so) new q_snowfall.R
library(snowfall)
SnowFunc <- function(envname) {
assign(envname, new.env())
foo.bar(x = rnorm(1), envname = envname)
return(get("val", envir = get(envname)))
}
source("q_func.R")
sfInit(parallel = TRUE, cpus = 2)
sfExport("foo.bar")
envs <- c("a", "b")
result <- sfClusterApplyLB(envs, SnowFunc)
sfStop()
Note also that I source'd before starting the cluster and used sfExport to export foo.bar to each node.
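The lookup difference can be seen without snowfall. With the default local = FALSE, source() defines foo.bar in the global environment, so get(envname) starts in foo.bar's own frame and then moves to the global environment, never visiting SnowFunc's frame where the new environment was created. The sketch below reproduces both behaviours in base R; the names foo.bar.old, foo.bar.new and the setter argument are hypothetical, lapply() stands in for sfClusterApplyLB(), and it assumes no global object named a exists.

```r
# Original lookup: get() starts in foo.bar's own frame, then its enclosure
foo.bar.old <- function(x, envname) {
  assign("val", x, envir = get(envname))
}

# Revised lookup: get() starts in the frame that called foo.bar
foo.bar.new <- function(x, envname) {
  assign("val", x, envir = get(envname, envir = parent.frame()))
}

SnowFunc <- function(envname, setter) {
  assign(envname, new.env())         # this environment exists only in this frame
  setter(x = 42, envname = envname)
  get("val", envir = get(envname))
}

worked <- lapply(c("a", "b"), SnowFunc, setter = foo.bar.new)
failed <- tryCatch(SnowFunc("a", foo.bar.old),
                   error = function(e) conditionMessage(e))
str(worked)
print(failed)
```

Passing the environment object itself instead of its name would avoid the lookup entirely, which is the underlying point of the answer above.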