Say I have a MATLAB function:
function y = myfunc(x)
persistent a
a = x*10
...
What is the equivalent statement in R for the persistent a statement? <<- or assign()?
Here's one way:
f <- local({ x<-NULL; function(y) {
if (is.null(x)) { # or perhaps !missing(y)
x <<- y+1
}
x
}})
f(3) # First time, x gets assigned
#[1] 4
f() # Second time, old value is used
#[1] 4
What happens is that local creates a new environment around the x<-NULL and the function declaration. So inside the function, it can get to the x variable and assign to it using <<-.
You can find the environment for a function like this:
e <- environment(f)
ls(e) # "x"
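As a further illustration of the same pattern, here is a minimal sketch of a call counter, which is closer to how a MATLAB persistent variable is often used. The names counter and count are hypothetical and not part of the question.

```r
# Hypothetical example: `count` plays the role of a persistent variable.
counter <- local({
  count <- 0                 # initialised once, when local() is evaluated
  function() {
    count <<- count + 1      # updates `count` in the enclosing environment
    count
  }
})

first  <- counter()          # 1
second <- counter()          # 2
environment(counter)$count   # 2
```

The state lives in the environment created by local(), so nothing is written to the global environment.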
For example, suppose I would like to be able to define a function that returned the name of the assignment variable concatenated with the first argument:
a <- add_str("b")
a
# "ab"
The function in the example above would look something like this:
add_str <- function(x) {
arg0 <- as.list(match.call())[[1]]
return(paste0(arg0, x))
}
but where the arg0 line of the function is replaced by a line that will get the name of the variable being assigned ("a") rather than the name of the function.
I've tried messing around with match.call and sys.call, but I can't get it to work. The idea here is that the assignment operator is being called on the variable and the function result, so that should be the parent call of the function call.
I think that it's not strictly possible, as other solutions explained, and the reasonable alternative is probably Yosi's answer.
However we can have fun with some ideas, starting simple and getting crazier gradually.
1 - define an infix operator that looks similar
`%<-add_str%` <- function(e1, e2) {
e2_ <- e2
e1_ <- as.character(substitute(e1))
eval.parent(substitute(e1 <- paste0(e1_,e2_)))
}
a %<-add_str% "b"
a
# "ab"
2 - Redefine := so that it makes available the name of the lhs to the rhs through a ..lhs() function
I think it's my favourite option :
`:=` <- function(lhs,rhs){
lhs_name <- as.character(substitute(lhs))
assign(lhs_name,eval(substitute(rhs)), envir = parent.frame())
lhs
}
..lhs <- function(){
eval.parent(quote(lhs_name),2)
}
add_str <- function(x){
res <- paste0(..lhs(),x)
res
}
a := add_str("b")
a
# [1] "ab"
There might be a way to redefine <- based on this, but I couldn't figure it out due to recursion issues.
3 - Use memory address dark magic to hunt lhs (if it exists)
This comes straight from: Get name of x when defining `(<-` operator
We'll need to change the syntax a bit and define the function fetch_name for this purpose. It is able to recover the name of the lhs from inside a *<- function, where as.character(substitute(lhs)) would return "*tmp*".
fetch_name <- function(x,env = parent.frame(2)) {
all_addresses <- sapply(ls(env), pryr:::address2, env)
all_addresses <- all_addresses[names(all_addresses) != "*tmp*"]
all_addresses_short <- gsub("(^|<)[0x]*(.*?)(>|$)","\\2",all_addresses)
x_address <- tracemem(x)
untracemem(x)
x_address_short <- tolower(gsub("(^|<)[0x]*(.*?)(>|$)","\\2",x_address))
ind <- match(x_address_short, all_addresses_short)
x_name <- names(all_addresses)[ind]
x_name
}
`add_str<-` <- function(x,value){
x_name <- fetch_name(x)
paste0(x_name,value)
}
a <- NA
add_str(a) <- "b"
a
4- a variant of the latter, using .Last.value :
add_str <- function(value){
x_name <- fetch_name(.Last.value)
assign(x_name,paste0(x_name,value),envir = parent.frame())
paste0(x_name,value)
}
a <- NA;add_str("b")
a
# [1] "ab"
Operations don't need to be on the same line, but they need to follow each other.
5 - Again a variant, using a print method hack
Extremely dirty and convoluted, to please the tortured spirits and troll the others.
This is the only one that really gives the expected output, but it works only in interactive mode.
The trick is that instead of doing all the work in the first operation I also use the second (printing). So in the first step I return an object whose value is "b", but I also assigned a class "weird" to it and a printing method, the printing method then modifies the object's value, resets its class, and destroys itself.
add_str <- function(x){
class(x) <- "weird"
assign("print.weird", function(x) {
env <- parent.frame(2)
x_name <- fetch_name(x, env)
assign(x_name,paste0(x_name,unclass(x)),envir = env)
rm(print.weird,envir = env)
print(paste0(x_name,x))
},envir = parent.frame())
x
}
a <- add_str("b")
a
# [1] "ab"
(a <- add_str("b")) will have the same effect as both lines above. print(a <- add_str("b")) would also have the same effect, and would work in non-interactive code as well.
This is generally not possible because the operator <- is actually parsed to a call of the <- function:
rapply(as.list(quote(a <- add_str("b"))),
function(x) if (!is.symbol(x)) as.list(x) else x,
how = "list")
#[[1]]
#`<-`
#
#[[2]]
#a
#
#[[3]]
#[[3]][[1]]
#add_str
#
#[[3]][[2]]
#[1] "b"
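The same structure can be inspected piece by piece. This is only a small sketch: quote() parses the expression without evaluating it, so add_str does not need to exist here.

```r
expr <- quote(a <- add_str("b"))   # parsed, not evaluated
is.call(expr)                      # TRUE
as.character(expr[[1]])            # "<-"
as.character(expr[[2]])            # "a"
deparse(expr[[3]])                 # "add_str(\"b\")"

# The prefix form performs the same assignment as the infix form
`<-`(a, "ab")
a                                  # "ab"
```

The name a is therefore an argument of the call to <-, not of the call to add_str.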
Now, you can access earlier calls on the call stack by passing negative numbers to sys.call, e.g.,
foo <- function() {
inner <- sys.call()
outer <- sys.call(-1)
list(inner, outer)
}
print(foo())
#[[1]]
#foo()
#[[2]]
#print(foo())
However, help("sys.call") says this (emphasis mine):
Strictly, sys.parent and parent.frame refer to the context of the
parent interpreted function. So internal functions (which may or may
not set contexts and so may or may not appear on the call stack) may
not be counted, and S3 methods can also do surprising things.
<- is such an "internal function":
`<-`
#.Primitive("<-")
`<-`(x, foo())
x
#[[1]]
#foo()
#
#[[2]]
#NULL
As Roland pointed out, the <- is outside the scope of your function and could only be located by looking at the stack of function calls, but this fails because <- is a primitive. So a possible solution could be to redefine '<-' as something other than a primitive or, better, to define something that does the same job and some additional things too.
I don't know if the ideas behind the following code fit your needs, but you can define a "verbose assignment":
`:=` <- function (var, value)
{
call = as.list(match.call())
message(sprintf("Assigning %s to %s.\n",deparse(call$value),deparse(call$var)))
eval(substitute(var <<- value))
return(invisible(value))
}
x := 1:10
# Assigning 1:10 to x.
x
# [1] 1 2 3 4 5 6 7 8 9 10
And it works in some other situations where the '<-' is not really a plain assignment:
y <- data.frame(c=1:3)
colnames(y) := "b"
# Assigning "b" to colnames(y).
y
# b
#1 1
#2 2
#3 3
z <- 1:4
dim(z) := c(2,2)
#Assigning c(2, 2) to dim(z).
z
# [,1] [,2]
#[1,] 1 3
#[2,] 2 4
I don't think the function has access to the variable it is being assigned to. It is outside of the function scope and you do not pass any pointer to it or specify it in any way. If you were to specify it as a parameter, you could do something like this:
add_str <- function(x, y) {
arg0 <-deparse(substitute(x))
return(paste0(arg0, y))
}
a <- 5
add_str(a, 'b')
#"ab"
I first defined a new variable x, then created a function that requires x within its body (not as an argument). See the code below:
x <- c(1,2,3)
f1 <- function() {
x^2
}
rm(x)
f2 <- function() {
x <- c(1,2,3)
f1()
}
f2()
Error in f1() : object 'x' not found
When I removed x and defined a new function f2 that first defines x and then executes f1, it reports that the object x is not found.
I just want to know why this is not working and how I can overcome this problem. I do not want x to be named as an argument of f1.
Please provide an appropriate title, because I do not know what kind of problem this is.
You could use a closure to make an f1 with the desired properties:
makeF <- function(){
x <- c(1,2,3)
f1 <- function() {
x^2
}
f1
}
f1 <- makeF()
f1() #returns 1 4 9
There is no x in the global scope but f1 still knows about the x in the environment that it was defined in.
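A small variation on this idea, as a sketch only: the factory can take x as an argument, so each call produces a function with its own private x. The names make_squarer, sq_a and sq_b are hypothetical.

```r
# Hypothetical factory: each call creates a fresh environment holding its own x
make_squarer <- function(x) {
  force(x)              # evaluate x now so the closure captures its value
  function() x^2
}

sq_a <- make_squarer(c(1, 2, 3))
sq_b <- make_squarer(10)

sq_a()                               # 1 4 9
sq_b()                               # 100
get("x", envir = environment(sq_b))  # 10
```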
In short: You are expecting dynamic scoping but are a victim of R's lexical scoping:
dynamic scoping = the enclosing environment of a command is determined during run-time (by where the function is called from)
lexical scoping = the enclosing environment of a command is determined at "compile time" (by where the function is defined)
To understand the lookup path of your variable x in the current and parent environments, try this code.
It shows that the two functions do not share the environment in which x is defined in f2, so x can never be found:
# list all parent environments of an environment to show the "search path"
parents <- function(env) {
while (TRUE) {
name <- environmentName(env)
txt <- if (nzchar(name)) name else format(env)
cat(txt, "\n")
if (txt == "R_EmptyEnv") break
env <- parent.env(env)
}
}
x <- c(1,2,3)
f1 <- function() {
print("f1:")
parents(environment())
x^2
}
f1() # works
# [1] "f1:"
# <environment: 0x4ebb8b8>
# R_GlobalEnv
# ...
rm(x)
f2 <- function() {
print("f2:")
parents(environment())
x <- c(1,2,3)
f1()
}
f2() # does not find "x"
# [1] "f2:"
# <environment: 0x47b2d18>
# R_GlobalEnv
# ...
# [1] "f1:"
# <environment: 0x4765828>
# R_GlobalEnv
# ...
Possible solutions:
1. Declare x in the global environment (bad programming style due to lack of encapsulation).
2. Use function parameters (this is what functions are made for).
3. Use a closure if x always has the same value for each call of f1 (not for beginners). See the other answer from @JohnColeman.
I strongly suggest using 2. (add x as a parameter - why do you want to avoid this?).
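For reference, the parameter-based solution applied to the functions from the question could look like this minimal sketch:

```r
f1 <- function(x) {
  x^2
}

f2 <- function() {
  x <- c(1, 2, 3)   # local to f2
  f1(x)             # handed to f1 explicitly
}

f2()                # 1 4 9
```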
I have some code below. I want to manipulate my initial top-level x variable when an error occurs in the tryCatch statement. I read the documentation, but I can't really figure out how it's supposed to be used.
Some questions I can't get my head around:
What is the workflow for this type of issue? (Should I define a new environment for the x variable and reference that environment when I want x in my local function?)
How do I use the get() function? I suppose for my task I need the get() function, coupled with the superassignment operator <<- or assign.
Something like x <<- x[! x %in% get(x, envir = "no idea")] is what I need.
You can try out the code by specifying any vector with valid Yahoo tickers, such as LoadData(c('YHOO', 'GOOG')). The tryCatch statement is meant to catch any tickers that do not exist, and in that case I want to modify my initial ticker list (the x variable) so that it no longer includes this ticker name. Thus the need for a get() operation.
LoadData <- function(x) {
if(is.atomic(x) != TRUE & is.data.frame(x) != TRUE) stop('x must be either a data.frame or an atomic object')
if(is.data.frame(x) == TRUE) x <- as.character(x[,1])
df.list <- lapply(x, function(x) {
poss.error <- tryCatch(
{
quantmod::getSymbols(x, env = NULL, return.class = 'data.frame')
},
error = function(e) {
message(cat(x, "could not be retrieved"))
# Get the x variable that was passed to LoadData() and manipulate it.
return(e)
})
})
}
In the function call LoadData(c('YHOO', 'GOOG')) mentioned in your question, the argument x is not a variable but simply a value. If the value is first stored in a variable, e.g. v, then the value of this variable can be altered by the function. (v is the "global" name outside the function, x is the name inside the function.)
Now consider the function call LoadData(x=v) or simply LoadData(v). To get the variable v from inside the function, two things are needed:
The environment env in which the variable v is stored,
The name under which the variable v is stored in the environment env.
The environment env should be another argument of the function LoadData, perhaps with the global environment as default value:
LoadData <- function(x,env=.GlobalEnv) { ... }
The trick to get the name of the variable passed to the argument x is to use the function match.call. as.list(match.call()) is a named list and as.list(match.call())$x is the "symbol" that is passed to the argument x, i.e. "v" in our case. Then
x.name <- as.character(as.list(match.call())$x)
is the desired name of the variable passed to the argument x.
Now you can use env[[x.name]] to alter the value of v. The value of v is get(x.name,env), but this is the same as the value of x. So get is not really needed.
Here is a small example:
f <- function( x, v, env=.GlobalEnv )
{
x.name <- as.character(as.list(match.call())$x)
if ( !is.numeric(x) ) { stop(paste0(x.name," must be numeric")) }
env[[x.name]] <- x-v
return(NULL)
}
> x <- 5
> y <- 3
> z <- "abc"
> f(x,1)
NULL
> x
[1] 4
> f(y,2)
NULL
> y
[1] 1
> f(z,3)
Error in f(z, 3) : z must be numeric
If f is called from another function g to alter the value of a local variable a, the argument env has to be used:
g <- function()
{
a <- 10
print("global environment:")
print(ls(.GlobalEnv))
print("local environment:")
print(ls(environment()))
print("value of `a` before calling `f`:")
print(a)
f(a,1,environment())
print("value of `a` after calling `f`:")
print(a)
return(NULL)
}
> g()
[1] "global environment:"
[1] "f" "g" "x" "y" "z"
[1] "local environment:"
[1] "a"
[1] "value of `a` before calling `f`:"
[1] 10
[1] "value of `a` after calling `f`:"
[1] 9
NULL
If the variable passed to LoadData is always the same variable and stored in the global environment, LoadData doesn't need any argument. Then you can simply use <<-.
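The same operator also reaches a local variable of an enclosing function, which may be enough for the original use case. The following is only a sketch under stated assumptions: fetch_symbol is a hypothetical stand-in for the real download call (it is not a quantmod function), and load_data and symbols are illustrative names.

```r
# Hypothetical stand-in for a download function: it fails for unknown symbols.
fetch_symbol <- function(sym) {
  if (!sym %in% c("YHOO", "GOOG")) stop("unknown symbol: ", sym)
  data.frame(symbol = sym, price = 1)
}

load_data <- function(symbols) {
  results <- lapply(symbols, function(sym) {
    tryCatch(
      fetch_symbol(sym),
      error = function(e) {
        message(sym, " could not be retrieved")
        # <<- searches the enclosing environments and finds `symbols`
        # in the frame of load_data(), so the local vector is updated.
        symbols <<- symbols[symbols != sym]
        NULL
      }
    )
  })
  # Drop the NULL entries left by failed downloads and return both pieces.
  list(symbols = symbols, data = Filter(Negate(is.null), results))
}

res <- load_data(c("YHOO", "NOPE", "GOOG"))
res$symbols        # "YHOO" "GOOG"
length(res$data)   # 2
```

Giving the inner function's argument a different name (sym rather than x) avoids shadowing the outer vector, and returning the cleaned vector keeps the caller's variables untouched.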
CASE 1:
rm(list = ls())
foo <- function(x = 6){
set <- function(){
x <- x*x}
set()
x}
foo()
# [1] 6
CASE 2:
rm(list = ls())
foo <- function(x = 6){
set <- function(){
x <<- x*x}
set()
x}
foo()
# [1] 36
I read that the <<- operator can be used to assign a value to an object in an environment other than the current one. Which environment's objects can be assigned using <<-? In my case the environment is that of the function foo: can <<- assign to objects outside the function, or only to objects in the current environment? I am totally confused about when to use <- and when to use <<-.
The operator <<- is the parent scope assignment operator. It is used to make assignments to variables in the nearest parent scope to the scope in which it is evaluated. These assignments therefore "stick" in the scope outside of function calls. Consider the following code:
fun1 <- function() {
x <- 10
print(x)
}
> x <- 5 # x is defined in the outer (global) scope
> fun1()
[1] 10 # x was assigned to 10 in fun1()
> x
[1] 5 # but the global value of x is unchanged
In the function fun1(), a local variable x is assigned to the value 10, but in the global scope the value of x is not changed. Now consider rewriting the function to use the parent scope assignment operator:
fun2 <- function() {
x <<- 10
print(x)
}
> x <- 5
> fun2()
[1] 10 # x was assigned to 10 in fun2()
> x
[1] 10 # the global value of x changed to 10
Because the function fun2() uses the <<- operator, the assignment of x "sticks" after the function has finished evaluating. What R actually does is to go through all scopes outside fun2() and look for the first scope containing a variable called x. In this case, the only scope outside of fun2() is the global scope, so it makes the assignment there.
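When there is more than one scope outside the assigning function, the first one that contains the variable wins. A minimal sketch (outer_fn and inner_fn are hypothetical names) that mirrors CASE 2 from the question:

```r
x <- 5                       # global x

outer_fn <- function() {
  x <- 1                     # local to outer_fn
  inner_fn <- function() {
    x <<- 2                  # nearest enclosing x is the one in outer_fn
  }
  inner_fn()
  x
}

outer_fn()                   # 2
x                            # 5: the global x was not touched
```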
As a few have already commented, the <<- operator is frowned upon by many because it can break the encapsulation of your R scripts. If we view an R function as an isolated piece of functionality, then it should not be allowed to interfere with the state of the code which calls it. Abusing the <<- assignment operator runs the risk of doing just this.
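The <<- operator can be used to assign to a variable in the global environment from inside a function (more precisely, it assigns in the first enclosing environment that already contains the variable, falling back to the global environment). If you need such an assignment, the assign function is preferable to <<- because the target environment is stated explicitly. You probably shouldn't need either, though: outputs needed from functions should be returned as objects in R.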
Here's an example
f <- function(x) {
y <<- x * 2 # y outside the function
}
f(5) # y = 10
This is equivalent to
f <- function(x) {
x * 2
}
y <- f(5) # y = 10
With the assign function,
f <- function(x) {
assign('y', x*2, envir=.GlobalEnv)
}
f(5) # y = 10
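When a function has more than one output, returning them together avoids any non-local assignment. A minimal sketch, with summarise_vec as a hypothetical name:

```r
# Hypothetical function that needs to hand back two results
summarise_vec <- function(v) {
  list(total = sum(v), doubled = v * 2)
}

out <- summarise_vec(c(1, 2, 3))
out$total     # 6
out$doubled   # 2 4 6
```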
I have the function below. I cannot alter the function in any way except for the first block of code in it.
In this simple example I want to apply some function to the returned object.
The point is that the name of the variable returned by the function may vary and I am not able to guess it.
Obviously I also cannot wrap the f function into { x <- f(); myfun(x); x }.
The .Last.value in my on.exit call below is meant to represent the value to be returned by the f function.
f <- function(param){
# the only code I know - start
on.exit(if("character" %in% class(.Last.value)) message(print(.Last.value)) else message(class(.Last.value)))
# the only code I know - end
# real processing of f()
a <- "aaa"
"somethiiiing"
if(param==1L) return(a)
b <- 5L
"somethiiiing"
if(param==2L) return(b)
"somethiiiing"
return(32)
}
f(1L)
# function
# [1] "aaa"
f(2L)
# aaa
# [1] 5
f(3L)
# integer
# [1] 32
The above code with .Last.value seems to work with a lag (so in fact it does not work). .Last.value is probably not the way to go anyway, as I want to use the value several times, as in if(fun0(x)) fun1(x) else fun2(x), and because the returned value might be a big object, copying it on the side is also a bad approach.
Is there any way to use on.exit, or any other function, to run my function on the result of f without knowing the name of the result variable?
In a similar way to how you are modifying the function, you could easily wrap it as well. Here's a reproducible example.
library(data.table)
append.log<-function(x) {
cat(paste("value:",x,"\n"))
}
idx.dt <- data.table:::`[.data.table`
environment(idx.dt)<-asNamespace("data.table")
idx.wrap <- function(...) {
x<-do.call(idx.dt, as.list(substitute(...())), envir=parent.frame())
append.log(if(is(x, "data.table")) {
nrow(x)
} else { NA })
x
}
environment(idx.wrap)<-asNamespace("data.table")
(unlockBinding)("[.data.table",asNamespace("data.table"))
assign("[.data.table",idx.wrap,envir=asNamespace("data.table"),inherits=FALSE)
dt<-data.table(a=1:10, b=seq(2, 20, by=2), c=letters[1:10])
dt[a%%2==0]
Since R 3.2.0 this is fully possible, thanks to the new function returnValue.
Working example below.
f <- function(x, err = FALSE){
pt <- proc.time()[[3L]]
on.exit(message(paste("proc.time:",round(proc.time()[[3L]]-pt,4),"\nnrow:",as.integer(nrow(returnValue()))[1L])))
Sys.sleep(0.001)
if(err) stop("some error")
return(x)
}
dt <- data.frame(a = 1:5, b = letters[1:5])
f(dt)
f(dt, err=TRUE)
f(dt)
f(dt[dt$a %in% 2:3 & dt$b %in% c("c","d"),])
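Applied to the function from the question, the same idea might look like the sketch below. It assumes R >= 3.2.0 and drops the placeholder "somethiiiing" lines; only the first statement differs from the original f.

```r
f <- function(param) {
  # Only this first statement differs from the function in the question.
  on.exit({
    val <- returnValue()     # value about to be returned; requires R >= 3.2.0
    if (is.character(val)) message(val) else message(class(val)[1L])
  })
  a <- "aaa"
  if (param == 1L) return(a)
  b <- 5L
  if (param == 2L) return(b)
  return(32)
}

f(1L)   # message: aaa      value: "aaa"
f(2L)   # message: integer  value: 5
f(3L)   # message: numeric  value: 32
```

The value is stored once in val and can then be used several times inside the handler, whatever the variable was called inside f.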