R Programming - create a variable in the environment the function was called from - r

I have a function whose task is to create a variable in the parent object. What I want is for the function to create the variable at the level at which it's called.
createVariable <- function(var.name, var.value) {
assign(var.name,var.value,envir=parent.frame())
}
# Works
testFunc <- function() {
createVariable("testVar","test")
print(testVar)
}
# Doesn't work
testFunc2 <- function() {
testFunc()
print(testVar)
}
> testFunc()
[1] "test"
> testFunc2()
[1] "test"
Error in print(testVar) : object 'testVar' not found
I'm wondering if there's any way to do this without creating the variable in the global environment scope.
Edit: Is there also a way to unit test that a variable has been created?

Try this:
createVariable <- function(var.name, var.value) {
assign(var.name,var.value,envir=parent.env(environment()))
}
Edit:
Some more details here and here.
With the initial solution, the variable is created in the global env because parent.env is the environment in which the function is defined and the createVariable function is defined in the global environment.
You might also want to try assign(var.name,var.value,envir=as.environment(sys.frames()[[1]])), which will create the variable in the frame of the first function on the call stack, that is, the highest-level test function that ends up calling createVariable in your example. In that case, however, you will need to remove print(testVar) from testFunc when you call testFunc2, because the variable will only be created in the environment of testFunc2, not testFunc. I don't know if that's what you mean by "at the level at which it's called".
If you run this:
createVariable <- function(var.name, var.value) {
assign(var.name,var.value,envir=as.environment(sys.frames()[[1]]))
print("creating")
}
testFunc <- function() {
createVariable("testVar","test")
print("func1")
print(exists("testVar"))
}
testFunc2 <- function() {
testFunc()
print("func2")
print(exists("testVar"))
}
testFunc()
testFunc2()
You get
> testFunc()
[1] "creating"
[1] "func1"
[1] TRUE
> testFunc2()
[1] "creating"
[1] "func1"
[1] FALSE
[1] "func2"
[1] TRUE
Which means testVar is in testFunc2's environment, not in testFunc's. Creating a new environment, as others suggest, might be safer.
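On the unit-testing part of the question, one possible check is exists() with inherits = FALSE, which looks only in the given environment and not in its parents. The sketch below assumes a variant of createVariable with an extra envir argument (that argument is my addition, not part of the code above), so the target can be stated explicitly when needed:

```r
# Variant of createVariable with an explicit target environment.
# The envir argument is an assumption added for this sketch; its default,
# parent.frame(), is the frame of whoever calls createVariable.
createVariable <- function(var.name, var.value, envir = parent.frame()) {
  assign(var.name, var.value, envir = envir)
  invisible(envir)
}

testFunc <- function() {
  createVariable("testVar", "test")
  # inherits = FALSE restricts the lookup to this function's own frame
  exists("testVar", envir = environment(), inherits = FALSE)
}

created_locally <- testFunc()
# The global environment should not have received the variable
leaked_to_global <- exists("testVar", envir = globalenv(), inherits = FALSE)
```

In a fresh session, created_locally should be TRUE and leaked_to_global should be FALSE.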

You need the parent environment to do this, not the calling environment:
createVariable <- function(var.name, var.value) {
#use parent.env(environment())
assign(var.name,var.value,envir=parent.env(environment()))
}
> testFunc()
[1] "test"
> testFunc2()
[1] "test"
[1] "test"
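The difference between the two kinds of "parent" can be checked directly. This is only a sketch with hypothetical helper names (whereAmI, callerFunc): parent.env(environment()) is the environment where the function was defined, while parent.frame() is the frame of the function that called it.

```r
# Hypothetical helper: reports both "parents" of its own evaluation frame
whereAmI <- function() {
  enclosing <- parent.env(environment())  # where whereAmI was defined
  caller <- parent.frame()                # frame of the function calling it
  list(enclosing = enclosing, caller = caller)
}

callerFunc <- function() {
  here <- environment()
  res <- whereAmI()
  list(
    enclosing_is_defining_env = identical(res$enclosing, environment(whereAmI)),
    caller_is_this_frame = identical(res$caller, here)
  )
}

result <- callerFunc()
```

Both elements of result should be TRUE. When whereAmI is defined at top level, its enclosing environment is the global environment, which is why the parent.env version makes testVar visible everywhere.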

Why do you want to do this? Using assign can lead to hard to find bugs and other problems.
What might be a better approach is to create a new environment just before calling your function of interest. Then in your function assign into that new environment (best if the environment is passed to the function, but can use lexical scoping as well to access it). Then when the function returns you will have the new variable in the environment:
createVariable <- function(var.name, var.value, env) {
env[[var.name]] <- var.value
}
testfunc <- function() {
tmpenv <- new.env()
createVariable('x', 1:10, tmpenv)
print(ls())
print(ls(envir=tmpenv))
print(tmpenv$x)
}
If createVariable were defined inside of testfunc then it could access tmpenv directly without needing it passed down (but passing down is safest when possible).
This version of createVariable could even be used more directly to assign in the environment of the calling function (but this becomes more dangerous, too easy to overwrite something in the current environment, or trying to access something by the wrong name due to a small typo):
testfunc2 <- function() {
createVariable('y', 5:1, environment())
print(ls())
print(y)
}

If you create a new environment and assign the value to it:
my.env <- new.env()
my.env$my.object <- runif(1)
Then call it using get:
> my.object # not found in global env
Error: object 'my.object' not found
> get("my.object", envir = my.env)
[1] 0.07912637
For your function:
createVariable <- function(env.name, var.name, var.value) {
# env.name must be an existing environment, e.g. one created with new.env()
assign(var.name, var.value, envir = env.name)
}

Related

save(list = ls()) Inside function not working with Assign variables

I'm trying to use ls() with variables I've brought into a function's environment from the parent function, or the global environment. I've seen this:
How to search an environment using ls() inside a function?
And this:
rm(list = ls()) doesn't work inside a function. Why?
But I still have this problem: my ls() call is not returning the assign variables that I know are inside that function's environment. I'm not trying to get the Global Environment, I want my function's environment. What am I missing?
g = function() {
e = environment()
get("f",envir = parent.env(e))
print(f)
save(list=ls(envir = e),file = "U:/GitHub/FunctionDebug.RData")
}
h = function() {
g()
}
f <- 1
h()
[1] 1 # So I know that my variable is seen by the function!
When I load the file back into my global environment, I get an empty namespace (other than e, the function environment). This only happens for variables that I've assigned from another function's environment. Why?
load("U:/GitHub/FunctionDebug.RData")
> ls()
[1] "e"
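A likely explanation, offered as a sketch rather than a definitive answer: get() only returns the value it finds and does not create a binding in the function's environment, and print(f) succeeds because f is found through normal scoping. Storing the result of get() creates a local binding that ls() can see. Here tempfile() stands in for the original output path, and restored is a hypothetical environment used to inspect the file:

```r
f <- 1

g <- function() {
  e <- environment()
  # get() only returns the value; assigning the result creates a local binding
  f <- get("f", envir = parent.env(e))
  # tempfile() is a stand-in for the original file path
  out <- tempfile(fileext = ".RData")
  save(list = ls(envir = e), file = out, envir = e)
  out
}

h <- function() {
  g()
}

saved_file <- h()
# Load into a separate environment so the result is easy to inspect
restored <- new.env()
load(saved_file, envir = restored)
saved_names <- ls(restored)
```

With the local assignment in place, saved_names should include "f" alongside "e" and "out".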

Understanding R function lazy evaluation

I'm having a little trouble understanding why, in R, the two functions below, functionGen1 and functionGen2 behave differently. Both functions attempt to return another function which simply prints the number passed as an argument to the function generator.
In the first instance the generated functions fail as a is no longer present in the global environment, but I don't understand why it needs to be. I would've thought it was passed as an argument, and is replaced with aNumber in the namespace of the generator function, and the printing function.
My question is: Why do the functions in the list list.of.functions1 no longer work when a is not defined in the global environment? (And why does this work for the case of list.of.functions2 and even list.of.functions1b)?
functionGen1 <- function(aNumber) {
printNumber <- function() {
print(aNumber)
}
return(printNumber)
}
functionGen2 <- function(aNumber) {
thisNumber <- aNumber
printNumber <- function() {
print(thisNumber)
}
return(printNumber)
}
list.of.functions1 <- list.of.functions2 <- list()
for (a in 1:2) {
list.of.functions1[[a]] <- functionGen1(a)
list.of.functions2[[a]] <- functionGen2(a)
}
rm(a)
# Throws an error "Error in print(aNumber) : object 'a' not found"
list.of.functions1[[1]]()
# Prints 1
list.of.functions2[[1]]()
# Prints 2
list.of.functions2[[2]]()
# However this produces a list of functions which work
list.of.functions1b <- lapply(c(1:2), functionGen1)
A more minimal example:
functionGen1 <- function(aNumber) {
printNumber <- function() {
print(aNumber)
}
return(printNumber)
}
a <- 1
myfun <- functionGen1(a)
rm(a)
myfun()
#Error in print(aNumber) : object 'a' not found
Your question is not about namespaces (that's a concept related to packages), but about variable scoping and lazy evaluation.
Lazy evaluation means that function arguments are only evaluated when they are needed. Until you call myfun it is not necessary to evaluate aNumber = a. But since a has been removed by then, this evaluation fails.
The usual solution is to force evaluation explicitly as you do with your functionGen2 or, e.g.,
functionGen1 <- function(aNumber) {
force(aNumber)
printNumber <- function() {
print(aNumber)
}
return(printNumber)
}
a <- 1
myfun <- functionGen1(a)
rm(a)
myfun()
#[1] 1
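To see the timing of argument evaluation more directly, here is a small sketch (the names evaluated and showLaziness are made up for illustration). The argument expression has a side effect, so we can observe when it actually runs:

```r
evaluated <- FALSE

showLaziness <- function(x) {
  before <- evaluated   # x has not been touched yet
  force(x)              # the argument expression is evaluated here
  c(before = before, after = evaluated)
}

# The braces form the argument expression; it flips the flag when evaluated
timing <- showLaziness({ evaluated <- TRUE; 1 })
```

timing should show FALSE for before and TRUE for after: the argument expression only runs once the function body first needs x.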

Restrict which functions can modify an object

I have a variable in my global environment called myList. I have a function called myFunction that modifies myList and re-assigns it in the global environment. I only want myList to be modified by myFunction. Is there a way to prevent any other function from modifying myList?
For background, I am building a general tool for R users. I don't want users of the tool to be able to define their own function to modify myList. I also don't want to be able to modify myList myself with a function I may write in the future.
I have a potential solution, but I don't like it. When the tool is executed, I could examine the text of every function defined by a user and search for the text that will assign myList to the global environment. I don't like the fact that I need to search over all functions.
Does anyone know if what I am looking for is implementable in R? Thanks for any help that can be provided.
For a reproducible example. I need code that will make the following example possible:
assign('myList', list(), envir = globalenv())
myFunction <- function() {
myList <- c(myList, 'test')
assign('myList', myList, envir = globalenv())
}
userFunction <- function() {
myList <- c(myList, 'test')
assign('myList', myList, envir = globalenv())
}
myFunction() # I need some code that will allow this function to run successfully
userFunction() # and cause an error when this function runs
Sounds like you need the modules package.
Basically, each unit of code has its own scope.
e.g.
# install.packages("modules")
# Load library
library("modules")
# Create a basic module
m <- module({
.myList <- list()
myFunction <- function() {
.myList <<- c(.myList, 'test')
}
get <- function() .myList
})
# Accessor
m$get()
# list()
# Your function
m$myFunction()
# Modification
m$get()
# [[1]]
# [1] "test"
Note, we tweaked the example slightly by changing the variable name from myList to .myList. So, we'll need to update that in userFunction():
userFunction <- function() {
.myList <- c(.myList, 'test')
}
Running this, we now get:
userFunction()
# Error in userFunction() : object '.myList' not found
As desired.
For more detailed examples see modules vignette.
The alternative is you can define an environment (new.env()) and then lock it after you have loaded myList.
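A rough sketch of that alternative in base R, assuming a dedicated environment (called state here, a name chosen for illustration) instead of the global environment. lockEnvironment() with bindings = TRUE locks both the existing bindings and the set of names:

```r
# Dedicated environment holding the protected object
state <- new.env()
state$myList <- list("test")

# Lock the environment and every binding currently in it
lockEnvironment(state, bindings = TRUE)

# Trying to overwrite the existing binding should fail
attempt <- tryCatch({
  assign("myList", list(), envir = state)
  "modified"
}, error = function(e) "blocked")

# Trying to add a new binding should fail as well
new_binding <- tryCatch({
  assign("other", 1, envir = state)
  "added"
}, error = function(e) "blocked")
```

Both attempts should come back as "blocked". Note that the lock applies to everyone, so a function that is supposed to modify myList would have to unlock the binding first.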
This is an all-around bad idea, beginning with assignment into the global environment (I'd never use a package that does this) and ending with surprising your users. You should probably just use S4 or reference classes.
Anyway, you can lock the bindings (or environment if you followed better practices). You wouldn't stop an advanced user with that, but they would at least know that you don't want them to change the object.
createLocked <- function(x, name, env) {
assign(name, x, envir = env)
lockBinding(name, env)
invisible(NULL)
}
createLocked(list(), "myList", globalenv())
myFunction <- function() {
unlockBinding("myList", globalenv())
myList <- c(myList, 'test')
assign('myList', myList, envir = globalenv())
lockBinding("myList", globalenv())
invisible(NULL)
}
userFunction <- function() {
myList <- c(myList, 'test')
assign('myList', myList, envir = globalenv())
}
myFunction() # runs successfully
userFunction()
#Error in assign("myList", myList, envir = globalenv()) :
# cannot change value of locked binding for 'myList'
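One possible refinement of the unlock/modify/relock pattern, as a sketch only: on.exit() re-locks the binding even if the modification throws an error. This version assumes a separate environment named store rather than the global environment, and appendLocked is a hypothetical name:

```r
# Separate environment used instead of globalenv() in this sketch
store <- new.env()
assign("myList", list(), envir = store)
lockBinding("myList", store)

appendLocked <- function(value, env = store) {
  unlockBinding("myList", env)
  # Re-lock when the function exits, whether normally or through an error
  on.exit(lockBinding("myList", env), add = TRUE)
  env$myList <- c(env$myList, value)
  invisible(NULL)
}

appendLocked("test")
still_locked <- bindingIsLocked("myList", store)
```

After the call, store$myList should hold the new element and still_locked should be TRUE.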

In a function, is it possible to `return( eval ( expr ) )`

I am trying to do something like a global assignment within a function (e.g. with ... <<- ...), using the following code
test_function = function(){
return(eval(parse(text = "test <- 4^2")))
}
test_function()
Which doesn't assign 16 to test in the environment test_function is called.
However
test_function = function(){
return(expression(test <- 4^2))
}
eval(test_function())
does!
Is there any way of doing the former without resorting to the latter?
Well, I would be careful. If you did just
test_function = function(){
test <- 4^2
}
that value would not be in the global environment either, and that's essentially what you're doing in your first function. Note that
test_function = function(){
return(eval(parse(text = "test <- 4^2")))
}
print(test_function())
# [1] 16
returns 16, so the assignment is happening in the function scope just as expected and its value is being returned. There's no reason to think that would behave any differently. If you want to evaluate in the parent scope, then be explicit about it
test_function = function(){
return(eval(parse(text = "test <- 4^2"), parent.frame()))
}
test_function()
or if you want to always operate in the global environment, specify that
test_function = function(){
return(eval(parse(text = "test <- 4^2"), globalenv()))
}
test_function()
But really this seems like a poor design decision. It's not polite for functions to have global side effects like that. Make sure this is absolutely necessary for your application and you have no other options.
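If the side effect turns out not to be necessary, the side-effect-free version is simply to return the value and let the caller decide where to store it. A minimal sketch, with compute_test as a made-up name:

```r
# Returns the value instead of assigning it anywhere
compute_test <- function() {
  4^2
}

# The caller chooses the name and the environment
test <- compute_test()
```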
eval.parent may be a "safer" approach, if you want to assign into the calling frame of test_function.
test_function = function(){
eval.parent(quote(test <- 4^2))
}
test_function()
test
# [1] 16
You could use assign() with the assignment to the parent frame
> test_function <- function(){
assign("test", 4^2, parent.frame())
}
> test_function()
> test
[1] 16

Get the attribute of a packaged function from within itself

Suppose we have these functions in an R package.
prova <- function() {
print(attr(prova, 'myattr'))
print(myattr(prova))
invisible(TRUE)
}
'myattr<-' <- function(x, value) {
attr(x, 'myattr') <- value
x
}
myattr <- function(x) attr(x, 'myattr')
So, I install the package and then I test it. This is the result:
prova()
# NULL
# NULL
myattr(prova) <- 'ciao' # setting 'ciao' for 'myattr' attribute
prova()
# NULL
# NULL # Why NULL here ?
myattr(prova)
# [1] "ciao"
attr(prova, 'myattr')
# [1] "ciao"
The question is: how to get the attribute of the function from within itself?
Inside the function itself I cannot get its attribute, as demonstrated by the example.
I suppose that the solution will be of the "computing on the language" kind (match.call()[[1L]], substitute, environments and friends). Am I wrong?
I think that the important point here is that this function is in a package (so, it has its environment and namespace) and I need its attribute inside itself, in the package, not outside.
You can use get with the envir argument.
prova <- function() {
print(attr(get("prova", envir=envir.prova), 'myattr'))
print(myattr(prova))
invisible(TRUE)
}
e.g.:
envir.prova <- environment()
prova()
# NULL
# NULL
myattr(prova) <- 'ciao'
prova()
# [1] "ciao"
# [1] "ciao"
Where envir.prova is a variable whose value you set to the environment in which prova is defined.
Alternatively you can use get(.. envir=parent.frame()), but that is less reliable as then you have to track the calls too, and ensure against another object with the same name between the target environment and the calling environment.
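One way to see why the attribute looks NULL from inside the function is to imitate a package namespace with a plain environment. This is a sketch under that assumption (pkg is a hypothetical stand-in, not a real namespace): setting the attribute from outside creates a modified copy in the user's environment, while the name prova inside the function body still resolves to the untouched original.

```r
# Hypothetical stand-in for a package namespace
pkg <- new.env()
local({
  # Inside the body, the name prova resolves to this original object
  prova <- function() attr(prova, "myattr")
}, envir = pkg)

# What the user sees: a copy in their own environment, then modified
prova <- pkg$prova
attr(prova, "myattr") <- "ciao"

inside <- prova()                       # lookup happens in pkg
outside <- attr(prova, "myattr")        # lookup happens on the user's copy
original <- attr(pkg$prova, "myattr")   # the object in pkg is unchanged
```

Here inside and original should be NULL while outside is "ciao", which matches the behaviour described in the question.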
Update regarding question in the comments:
Regarding parent.frame() versus an explicit environment variable: parent.frame, as the name suggests, goes "up one level", to the frame of the caller. Often, that is exactly where you want to go, so that works fine. And even when your goal is to get an object in an environment further up, get() does not stop at the environment it is given: it keeps searching through that environment's enclosing environments until it finds an object with the matching name. So very often, parent.frame() is just fine.
HOWEVER if there are multiple calls between where you are invoking parent.frame() and where the object is located AND in one of the intermediary environments there exists another object with the same name, then R will stop at that intermediary environment and return its object, which is not the object you were looking for.
Therefore, parent.frame() has an argument n (which defaults to 1), so that you can tell R to begin its search n levels back.
This is the "keeping track" that I refer to, where the developer has to be mindful of the number of calls in between. The straightforward way to go about this is to have an n argument in every function that is calling the function in question, and have that value default to 1. Then for the envir argument, you use: get/assign/eval/etc (.. , envir=parent.frame(n=n) )
Then if you call Func2 from Func1, (both Func1 and Func2 have an n argument), and Func2 is calling prova, you use:
Func1 <- function(x, y, ..., n=1) {
# ... some stuff ...
# forward whatever arguments Func2 needs (x and y are placeholders), adding one to n
Func2(x, y, ..., n=n+1)
}
Func2 <- function(a, b, c, ..., n=1) {
# ... some stuff ...
eval(quote(prova()), envir=parent.frame(n=n) )
}
As you can see, it is not complicated but it is tedious, and sometimes what seems like a bug creeps in, which is simply forgetting to carry the n over.
Therefore, I prefer to use a fixed variable with the environment name.
The solution that I found is:
myattr <- function(x) attr(x, 'myattr')
'myattr<-' <- function(x, value) {
# check that x is a function (e.g. the prova function)
# checks on value (e.g. also value is a function with a given precise signature)
attr(x, 'myattr') <- value
x
}
prova <- function(..., env = parent.frame()) {
# get the current function object (in its environment)
this <- eval(match.call()[[1L]], env)
# print(eval(as.call(c(myattr, this)), env)) # alternative
print(myattr(this))
# print(attr(this, 'myattr'))
invisible(TRUE)
}
I want to thank @RicardoSaporta for the help and the clarification about keeping track of the calls.
This solution doesn't work when, e.g., myattr(prova) <- function() TRUE is set inside func1 while prova is called in func2 (which is itself called by func1), unless you properly update its env parameter.
For completeness, following the suggestion of @RicardoSaporta, I slightly modified the prova function:
prova <- function(..., pos = 1L) {
# get the current function object (in its environment)
this <- eval(match.call()[[1L]], parent.frame(n = pos))
print(myattr(this))
# ...
}
This way, it also works when nested, if the correct pos parameter is passed in.
With this modification it is easier to fish out the environment in which you set the attribute on the function prova.
myfun1 <- function() {
myattr(prova) <- function() print(FALSE)
myfun2(n = 2)
}
myfun2 <- function(n) {
prova(pos = n)
}
myfun1()
# function() print(FALSE)
# <environment: 0x22e8208>
