R: How to replicate the <<- assignment with assign()?

I need to assign a variable in a function, whose name is a parameter of the function, and I need to access it later on, outside the function. I think <<- would do it in another situation, but since the name of the variable is dynamic, I think I need assign().
What I have at the moment
assignVar = function(varname) {
assign(varname,"blabla")
}
assignVar("foo")
foo
returns the usual Error: object 'foo' not found.
Is there some option to the assign function, or any other secret weapon, that I could use to do that? I have looked at the documentation, but I am still very confused about environments... (R beginner).

It's really important to understand the behavior of <<- before you use it. Once you understand that behavior, you will also note that you cannot use assign to achieve its behavior (at least not very easily). Here's a simple example:
a <- 1
f1 <- function(){
a <<- 2
NULL
}
f1()
a
# [1] 2
a <- 1
f2 <- function(){
a <- 2
f3 <- function(){
a <<- 3
}
f3()
NULL
}
f2()
a
# [1] 1
<<- does not assign in the current function's environment. It looks in the enclosing (parent) environment for an existing object with that name (a in this example) and, if there is none, goes up another level, repeating this until it reaches the global environment, where it ultimately assigns if no lower environment had a match. So in the above examples, f1() results in a change to the global environment but f2() does not: f3 finds the a created inside f2 and changes that one. If you comment out the a <- 2 line of f2, you do get a change in the global environment.
To achieve the same behavior using assign, you would need to write a much more complex function that loops through parent environments until it reaches the global environment. Regardless, having functions produce side effects is generally discouraged (due to introducing often unnecessary complexity to code and particularly when those side effects occur in the global environment).
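As a rough illustration of the loop described above, here is a minimal sketch. The helper name super_assign and its arguments are made up for this example (nothing like it exists in base R), and it follows the description given here by stopping the walk at the global environment.

```r
# Hypothetical helper (not part of base R): mimics `<<-` for a name held in a string.
super_assign <- function(name, value, start = parent.frame()) {
  # `<<-` skips the local environment and begins in the enclosing one
  env <- parent.env(start)
  while (!identical(env, globalenv()) && !identical(env, emptyenv())) {
    if (exists(name, envir = env, inherits = FALSE)) {
      # Found an existing binding below the global environment: change that one
      assign(name, value, envir = env)
      return(invisible(value))
    }
    env <- parent.env(env)
  }
  # Nothing found on the way up: assign in the global environment
  assign(name, value, envir = globalenv())
  invisible(value)
}

assignVar <- function(varname) {
  super_assign(varname, "blabla")
  invisible(NULL)
}
assignVar("foo")
```

After the call, foo should be visible outside the function, with all the caveats about side effects mentioned above.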

I highly suggest you don't do this:
assignVar = function(varname, envir = globalenv()) {
assign(varname, "blabla", envir=envir)
invisible(NULL)
}
assignVar("foo", envir=globalenv())
foo
[1] "blabla"
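If the goal is only to look the value up later by its dynamic name, one possible alternative is to keep it in a dedicated environment rather than the global one. This is a sketch under that assumption; the name store is invented for the example.

```r
# Hypothetical container for dynamically named values (instead of globalenv())
store <- new.env()

assignVar <- function(varname, envir) {
  assign(varname, "blabla", envir = envir)
  invisible(NULL)
}

assignVar("foo", envir = store)
value <- get("foo", envir = store)  # look the value up by name later
print(value)
```

The global environment stays untouched, and everything the function created can be listed with ls(store).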

Related

Why does rm inside a function not delete objects?

rel.mem <- function(nm) {
rm(nm)
}
I defined the above function rel.mem -- takes a single argument and passes it to rm
> ls()
[1] "rel.mem"
> x<-1:10
> ls()
[1] "rel.mem" "x"
> rel.mem(x)
> ls()
[1] "rel.mem" "x"
Now you can see that when I call rel.mem, x is not deleted -- I know this is because rm is being attempted on the wrong environment.
What is a good fix for this?
Criteria for a good fix:
The caller should not have to pass the environment
The callee (rel.mem) should be able to determine the environment by using an R language facility (call stack inspection, aspects, etc.)
The interface of the function rel.mem should be kept simple -- idiot proof: call rel.mem -- then rel.mem takes it from there -- no need to pass environments.
NOTES:
1. As many commenters have pointed out, one easy fix is to pass the environment.
2. What I meant by a good fix [and I should have clarified it] is that the callee function (in this case rel.mem) is able to work out which environment the caller was referring to and then remove the object from the right environment.
3. The type of reasoning in "2" can be done in other languages by inspecting the call stack -- for example in Java I would throw a dummy exception, catch it, and then parse the call stack. In other languages still I could use Aspect Oriented techniques. The question is: can something like that be done in R?
4. One commenter has suggested that there may be multiple objects with the same name, so the "right" environment is meaningless. As I've stated above, in other languages it is possible (sometimes with some creative trickery) to interpret the call stack -- this may not be possible in R.
5. One commenter has suggested that rm(list=nm, envir = parent.frame()) will remove this from the parent environment. This is correct -- however, I'm looking for something that will work for an arbitrary call depth.
The quick answer is that you're in a different environment - essentially picture the variables in a box: you have a box for the function and one for the Global Environment. You just need to tell rm where to find that box.
So
rel_mem <- function(nm) {
# State the environment
rm(list=nm, envir = .GlobalEnv )
}
x = 10
rel_mem("x")
Alternatively, you can use the pos argument, e.g.
rel_mem <- function(nm) {
rm(list=nm, pos=1 )
}
If you type search() you will see a vector of environments; the global environment is number 1.
Another two options are
envir = parent.frame() if you want to go one level up the call stack
inherits = TRUE if you want rm to keep searching through the enclosing environments until it finds the object
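The two options just listed could look roughly like this. The function names rel_mem_caller and rel_mem_inherit are invented for the sketch, and combining inherits = TRUE with envir = parent.frame() is one possible reading of the suggestion.

```r
# Remove the object from the environment of whoever called the function
rel_mem_caller <- function(nm) {
  rm(list = nm, envir = parent.frame())
}

# Start at the caller and keep looking in enclosing environments until found
rel_mem_inherit <- function(nm) {
  rm(list = nm, envir = parent.frame(), inherits = TRUE)
}

outer <- function() {
  y <- 1
  rel_mem_caller("y")            # removes outer's local y
  exists("y", inherits = FALSE)  # is y still defined locally?
}
still_there <- outer()

zz <- 5
wrapper <- function() rel_mem_inherit("zz")  # zz is not local to wrapper
wrapper()
```

Note that inherits = TRUE removes the first match it meets on the way up, which may not be the object you had in mind if the name is reused.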
In the above code, notice that I'm passing the object as a character - I'm passing the "x" not x. We can be clever and avoid this using the substitute function
rel_mem <- function(nm) {
rm(list = as.character(substitute(nm)), envir = .GlobalEnv )
}
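To see what the substitute() trick produces on its own, here is a small sketch; arg_name is a made-up helper used only for illustration.

```r
# substitute() captures the argument as typed, without evaluating it
arg_name <- function(nm) as.character(substitute(nm))

captured <- arg_name(x)  # works even if no object called x exists
print(captured)
```

This only behaves well for bare names; for an expression such as x[1], as.character() returns several pieces rather than a single name.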
To finish I'll just add that deleting things in the .GlobalEnv from a function is generally a bad idea.
Further resources:
Environments:http://adv-r.had.co.nz/Environments.html
Substitute function: http://adv-r.had.co.nz/Computing-on-the-language.html#capturing-expressions
If you are using another function, such as ls(), to find the global objects within your function, you must state the environment in it explicitly too:
rel_mem <- function(nm) {
# State the environment in both functions
globals <- ls(envir = .GlobalEnv)
rm(list = globals[startsWith(globals, "plot_")], envir = .GlobalEnv)
}

R: IF object is TRUE then assign object NOT WORKING

I am trying to write a very basic IF statement in R and am stuck. I thought I'd find someone with the same problem, but I can't. I'm sorry if this has been solved before.
I want to check if a variable/object has been assigned; IF TRUE, I want to execute a function that is part of an R package. First I wrote
FileAssignment <- function(x){
if(exists("x")==TRUE){
print("yes!")
x <- parse.vdjtools(x)
} else { print("Nope!")}
}
I assign a filename as x
FILENAME <- "FILENAME.txt"
I run the function
FileAssignment(FILENAME)
I use print("yes!") and print("Nope!") to check if the IF-Statement works, and it does. However, the parse.vdjtools(x) part is not assigned. Now I tested the same IF-statement outside of the function:
if(exists("FILENAME1")==TRUE){
FILENAME1 <- parse.vdjtools(FILENAME1)
}
This works. I read here that it might be because the function uses {} and the if-statement does too. So I should remove the brackets from the if-statement.
FileAssignment <- function(x){
if(exists("x")==TRUE)
x <- parse.vdjtools(x)
else { print("Nope!") }
}
Did not work either.
I thought it might be related to the specific parse.vdjtools(x) function, so I just tried assigning a normal value to x with x <- 20. Also did not work inside the function, however, it does outside.
I don't really know what you are trying to achieve, but I would say that the use of exists in this context is wrong. There is no way that x cannot exist inside the function. See this example
# All this does is report if x exists
f <- function(x){
if(exists("x"))
cat("Found x!", fill = TRUE)
}
f()
f("a")
f(iris)
# All will be found!
Investigate file.exists instead? This is vectorised, so a vector of files can be investigated at the same time.
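A short sketch of the vectorised behaviour mentioned above; the temporary file names are created just for the demonstration.

```r
# One file that exists and one that does not
existing <- tempfile(fileext = ".txt")
writeLines("demo", existing)
missing_file <- tempfile(fileext = ".txt")

# file.exists() checks every path in the vector at once
found <- file.exists(c(existing, missing_file))
print(found)

unlink(existing)  # clean up
```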
The question that you are asking is less trivial than you seem to believe. There are two points that should be addressed to obtain the desired behavior, and especially the first one is somewhat tricky:
1. As pointed out by @NJBurgo and @KonradRudolph, the variable x will always exist within the function, since it is an argument of the function. In your case exists() should therefore not check whether the variable x is defined. Instead, it should verify whether the variable whose name was passed as the argument exists. This is achieved by using a combination of deparse() and substitute():
if (exists(deparse(substitute(x)))) { …
2. Since x is defined only within the scope of the function, the superassignment operator <<- would be required to make a value assigned to x visible outside the function, as suggested by @thothai. However, functions should not have such side effects. Problems with this kind of programming include possible conflicts with another variable named x that could be defined in a different context outside the function body, as well as a lack of clarity concerning the operations performed by the function. A better way is to return the value instead of assigning it to a variable.
Combining these two aspects, the function could be rewritten like this:
FileAssignment <- function(x){
if (exists(deparse(substitute(x)))) {
print("yes!")
return(parse.vdjtools(x))
} else {
print("Nope!")
return(NULL)
}
}
In this version of the function, the scope of x is limited to the function body and the function has no side effects. The return value of FileAssignment(a) is either parse.vdjtools(a) or NULL, depending on whether a exists or not.
Outside the function, this value can be assigned to x with
x <- FileAssignment(a)
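To check the deparse(substitute(x)) idea without needing parse.vdjtools, a minimal sketch follows. The helper var_exists is hypothetical, and looking the name up from the caller via parent.frame() is an assumption added here, not something the answer above specifies.

```r
# Hypothetical helper: does a variable with the name passed as x exist?
var_exists <- function(x) {
  name <- deparse(substitute(x))        # e.g. "FILENAME", not the value of x
  exists(name, envir = parent.frame())  # look it up from the caller's side
}

FILENAME <- "FILENAME.txt"
has_filename <- var_exists(FILENAME)
has_other <- var_exists(NO_SUCH_OBJECT)  # never evaluated, so no error
print(c(has_filename, has_other))
```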

lapply() emptied list step by step while processing

First of all, excuse me for the bad title. I'm still so confused about this behavior that I wasn't able to describe it; however, I was able to reproduce it and broke it down to a (goofy) example.
Please, could you be so kind and explain why other.list appears to be full of NULLs after calling lapply()?
some.list <- rep(list(rnorm(1)),33)
other.list <- rep(list(), length = 33)
lapply(seq_along(some.list), function(i, other.list) {
other.list[[i]] <- some.list[[i]]
browser()
}, other.list)
I watched this in debugging mode in RStudio. For certain i, other.list[[i]] gets some.list[[i]] assigned, but it will be NULLed for the next iteration. I want to understand this behavior so bad!
The reason is that the assignment is taking place inside a function, and you've used the normal assignment operator <-, rather than the superassignment operator <<-. Inside a function scope, in other words when a function is executed, the normal assignment operator always assigns to a local variable in the evaluation environment that is created for that particular evaluation of that function (returned by a call to environment() from inside the function with fun=NULL). Thus, your global other.list variable, which is defined in the global environment (returned by globalenv()), will not be touched by such an assignment. The superassignment operator, on the other hand, follows the closure environment chain (which can be followed recursively via parent.env()) back until it finds a variable with the name on the LHS of the assignment, and then assigns to that. The global environment is always at the base of the closure environment chain. If no such variable is found, the superassignment operator creates one in the global environment.
Thus, if you change <- to <<- in the assignment that takes place inside the function, you will be able to modify the global other.list variable.
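A small side-by-side sketch of the two operators inside lapply(), using simplified stand-ins for some.list and other.list (three integers instead of 33 random numbers, and no other.list argument on the anonymous function):

```r
some.list <- as.list(1:3)
other.list <- vector("list", 3)  # three NULLs

# Local assignment: each call works on its own copy, which is then discarded
invisible(lapply(seq_along(some.list), function(i) {
  other.list[[i]] <- some.list[[i]]
}))
after_local <- other.list

# Superassignment: modifies the other.list found in the enclosing environment
invisible(lapply(seq_along(some.list), function(i) {
  other.list[[i]] <<- some.list[[i]]
}))
after_super <- other.list
```

after_local should still hold only NULLs, while after_super holds the copied values.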
See https://stat.ethz.ch/R-manual/R-devel/library/base/html/assignOps.html.
Here, I tried to make a little demo to demonstrate these concepts. In all my assignments, I'm assigning the actual environment that contains the variable being assigned to:
oldGlobal <- environment(); ## environment() is same as globalenv() in global scope
(function() {
newLocal1 <- environment(); ## creates a new local variable in this function evaluation's evaluation environment
print(newLocal1); ## <environment: 0x6014cbca8> (different for every evaluation)
oldGlobal <<- parent.env(environment()); ## target search hits oldGlobal in closure environment; RHS is same as globalenv()
newGlobal1 <<- globalenv(); ## target search fails; creates a new variable in the global environment
(function() {
newLocal2 <- environment(); ## creates a new local variable in this function evaluation's evaluation environment
print(newLocal2); ## <environment: 0x6014d2160> (different for every evaluation)
newLocal1 <<- parent.env(environment()); ## target search hits the existing newLocal1 in closure environment
print(newLocal1); ## same value that was already in newLocal1
oldGlobal <<- parent.env(parent.env(environment())); ## target search hits oldGlobal two closure environments up in the chain; RHS is same as globalenv()
newGlobal2 <<- globalenv(); ## target search fails; creates a new variable in the global environment
})();
})();
oldGlobal; ## <environment: R_GlobalEnv>
newGlobal1; ## <environment: R_GlobalEnv>
newGlobal2; ## <environment: R_GlobalEnv>
I haven't run your code, but two observations:
I usually avoid putting browser() as the last line inside a function because that gets treated as the return value
other.list does not get modified by your lapply. You need to understand the basics of environments: any bindings you make inside the function passed to lapply do not hold outside of it. This is by design; lapply is not meant to be used for side effects, and you should only use its return value. You can use the <<- operator instead of <-, though I don't recommend that, or you can use the assign function instead. Or you can do it properly, the way lapply is meant to be used:
other.list <- lapply(seq_along(some.list), function(i) {
some.list[[i]]
})
Note that it's generally recommended not to make assignments inside lapply that change variables outside of it. lapply is meant to perform a function on every element and return a list, and that list should be all that lapply is used for.

Mutating a variable in a closure

I'm pretty new to R, but coming from Scheme—which is also lexically scoped and has closures—I would expect being able to mutate outer variables in a closure.
E.g., in
foo <- function() {
s <- 100
add <- function() {
s <- s + 1
}
add()
s
}
cat(foo(), "\n") # prints 100 and not 101
I would expect foo() to return 101, but it actually returns 100:
$ Rscript foo.R
100
I know that Python has the global keyword to declare scope of variables (doesn't work with this example, though). Does R need something similar?
What am I doing wrong?
Update
Ah, is the problem that in add I am creating a new, local variable s that shadows the outer s? If so, how can I mutate s without creating a local variable?
Use the <<- operator for assignment in the add() function.
From ?"<<-":
The operators <<- and ->> are normally only used in functions, and cause a search to be made through parent environments for an existing definition of the variable being assigned. If such a variable is found (and its binding is not locked) then its value is redefined, otherwise assignment takes place in the global environment. Note that their semantics differ from that in the S language, but are useful in conjunction with the scoping rules of R. See ‘The R Language Definition’ manual for further details and examples.
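Applied to the function from the question, the suggestion amounts to something like the following sketch:

```r
foo <- function() {
  s <- 100
  add <- function() {
    s <<- s + 1  # finds s in foo's environment and updates it there
  }
  add()
  s
}
result <- foo()
cat(result, "\n")  # prints 101
```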
You can also use assign and define the scope precisely using the envir argument. In this case it works the same way as <<- in your add function (add is called directly from foo, so parent.frame() is foo's environment), but it makes your intention a little more clear:
foo <- function() {
s <- 100
add <- function() {
assign("s", s + 1, envir = parent.frame())
}
add()
s
}
cat(foo(), "\n")
Of course the better way for this kind of thing in R is to have your function return the variable (or variables) it modifies and explicitly reassign them to the original variable:
foo <- function() {
s <- 100
add <- function(x) x + 1
s <- add(s)
s
}
cat(foo(), "\n")
Here is one more approach that can be a little safer than the assign or <<- approaches:
foo <- function() {
e <- environment()
s <- 100
add <- function() {
e$s <- e$s + 1
}
add()
s
}
foo()
The <<- assignment can cause problems if you accidentally misspell your variable name: it will still do something, but not what you are expecting, and the source of the problem can be hard to find. The assign approach can be tricky if you then want to move your add function to inside another function, or call it from another function. The best approach overall is to not have the functions modify variables outside their own scope and have the function return anything that is important. But when that is not possible, the above method uses lexical scoping to access the environment e, then assigns into the environment so it will always assign specifically into that function, never above or below.
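To make the point about calling add from another function concrete, here is a sketch. The helper call_it and the names foo_assign and foo_env are invented for the comparison.

```r
call_it <- function(f) f()  # hypothetical helper that adds one call level

foo_assign <- function() {
  s <- 100
  add <- function() assign("s", s + 1, envir = parent.frame())
  call_it(add)  # parent.frame() is now call_it's frame, not foo_assign's
  s
}

foo_env <- function() {
  e <- environment()
  s <- 100
  add <- function() e$s <- e$s + 1
  call_it(add)  # e still points at foo_env's environment
  s
}

res_assign <- foo_assign()  # 100: the new s landed in call_it's frame
res_env <- foo_env()        # 101
```

The environment-based version keeps working because e is fixed when foo_env runs, whereas parent.frame() depends on who happens to call add.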

how to isolate a function

How can I ensure that when a function is called it is not allowed to grab variables from the global environment?
I would like the following code to give me an error. The reason is because I might have mistyped z (I wanted to type y).
z <- 10
temp <- function(x,y) {
y <- y + 2
return(x+z)
}
> temp(2,1)
[1] 12
I'm guessing the answer has to do with environments, but I haven't understood those yet.
Is there a way to make my desired behavior default (e.g. by setting an option)?
> library(codetools)
> checkUsage(temp)
<anonymous>: no visible binding for global variable 'z'
The function doesn't change, so no need to check it each time it's used. findGlobals is more general, and a little more cryptic. Something like
Filter(Negate(is.null), eapply(.GlobalEnv, function(elt) {
if (is.function(elt))
findGlobals(elt)
}))
could visit all functions in an environment, but if there are several functions then maybe it's time to think about writing a package (it's not that hard).
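For a single function, the findGlobals route might look like this sketch. It assumes the codetools package is installed (it normally ships with R) and uses merge = FALSE to separate functions from variables.

```r
library(codetools)

z <- 10
temp <- function(x, y) {
  y <- y + 2
  return(x + z)
}

# With merge = FALSE the result has two parts: $functions and $variables
globals <- findGlobals(temp, merge = FALSE)
print(globals$variables)  # should include "z"

# A simple flag one could act on, e.g. with stop() or warning()
uses_global_vars <- length(globals$variables) > 0
```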
environment(temp) = baseenv()
See also http://cran.r-project.org/doc/manuals/R-lang.html#Scope-of-variables and ?environment.
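A quick sketch of what the baseenv() suggestion does to the function from the question. The tryCatch wrapper is only there so the example runs to the end.

```r
z <- 10
temp <- function(x, y) {
  y <- y + 2
  return(x + z)
}

# Look-ups inside temp now go from its local variables straight to base R
environment(temp) <- baseenv()

outcome <- tryCatch(temp(2, 1), error = function(e) "error")
print(outcome)  # z can no longer be found, so the call fails
```

One consequence is that functions from attached packages other than base are no longer visible inside temp either, unless they are called with the pkg:: prefix.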
environment(fun) = parent.env(environment(fun))
(I'm using 'fun' in place of your function name 'temp' for clarity)
This will remove the "workspace" environment (.GlobalEnv) from the search path and leave everything else (eg all packages).
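A sketch of this variant, assuming the functions were defined at top level (so their environment is the global one) and that the stats package is attached, as it is by default:

```r
z <- 10
fun <- function(x) x + z
fun2 <- function(v) median(v)  # median() comes from the attached stats package

# Skip one level: look-ups now start after the functions' original environment
environment(fun) <- parent.env(environment(fun))
environment(fun2) <- parent.env(environment(fun2))

outcome <- tryCatch(fun(2), error = function(e) "error")  # z is out of reach
still_works <- fun2(c(1, 2, 3))                            # packages still visible
```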
