How can I create a custom assignment using a replacement function? - r

I have defined a function called once as follows:
once <- function(x, value) {
xname <- deparse(substitute(x))
if(!exists(xname)) {
assign(xname, value, env=parent.frame())
}
invisible()
}
The idea is that value is time-consuming to evaluate, and I only want to assign it to x the first time I run a script.
> z
Error: object 'z' not found
> once(z, 3)
> z
[1] 3
I'd really like the usage to be once(x) <- value rather than once(x, value), but if I write a function once<- it gets upset that the variable doesn't exist:
> once(z) <- 3
Error in once(z) <- 3 : object 'z' not found
Does anyone have a way around this?
ps: is there a name to describe functions like once<- or in general f<-?

If you are willing to modify your requirements slightly to use square brackets rather than parentheses then you could do this:
once <- structure(NA, class = "once")
"[<-.once" <- function(once, x, value) {
xname <- deparse(substitute(x))
pf <- parent.frame()
if (!exists(xname, pf)) assign(xname, value, pf)
once
}
# assigns 3 to x (assuming x does not currently exist)
once[x] <- 3
x # 3
# skips assignment (since x now exists)
once[x] <- 4
x # 3

As per item 3.4.4 in the R Language Reference, something like a names replacement is evaluated like this:
`*tmp*` <- x
x <- "names<-"(`*tmp*`, value=c("a","b"))
rm(`*tmp*`)
This is bad news for your requirement, because the assignment will fail on the first line (as x is not found), and even if it would work, your deparse(substitute) call will never evaluate to what you want it to.
Sorry to disappoint you

Related

assigning delayed variables in R

I've just read about delayedAssign(), but the way you have to do it is by passing the name of the delayed variable as the first parameter. Is there a way to do it via direct assignment?
e.g.:
x <- delayed_variable("Hello World")
rather than
delayedAssign("x","Hello World")
I want to create a variable that will throw an error if accessed (use-case is obviously more complex), so for example:
f <- function(x){
y <- delayed_variable(stop("don't use y"))
x
}
f(10)
> 10
f <- function(x){
y <- delayed_variable(stop("don't use y"))
y
}
f(10)
> Error in f(10) : don't use y
No, you can't do it that way. Your example would be fine with the current setup, though:
f <- function(x){
delayedAssign("y", stop("don't use y"))
y
}
f(10)
which gives exactly the error you want. The reason for this limitation is that delayed_variable(stop("don't use y")) would create a value which would trigger the error when evaluated, and assigning it to y would evaluate it.
Another version of the same thing would be
f <- function(x, y = stop("don't use y")) {
...
}
Internally it's very similar to the delayedAssign version.
I reached a solution using makeActiveBinding() which works provided it is being called from within a function (so it doesn't work if called directly and will throw an error if it is). The main purpose of my use-case is a smaller part of this, but I generalised the code a bit for others to use.
Importantly for my use-case, this function can allow other functions to use delayed assignment within functions and can also pass R CMD Check with no Notes.
Here is the function and it gives the desired outputs from my question.
delayed_variable <- function(call){
#Get the current call
prev.call <- sys.call()
attribs <- attributes(prev.call)
# If srcref isn't there, then we're not coming from a function
if(is.null(attribs) || !"srcref" %in% names(attribs)){
stop("delayed_variable() can only be used as an assignment within a function.")
}
# Extract the call including the assignment operator
this_call <- parse(text=as.character(attribs$srcref))[[1]]
# Check if this is an assignment `<-` or `=`
if(!(identical(this_call[[1]],quote(`<-`)) ||
identical(this_call[[1]],quote(`=`)))){
stop("delayed_variable() can only be used as an assignment within a function.")
}
# Get the variable being assigned to as a symbol and a string
var_sym <- this_call[[2]]
var_str <- deparse(var_sym)
#Get the parent frame that we will be assigining into
p_frame <- parent.frame()
var_env <- new.env(parent = p_frame)
#Create a random string to be an identifier
var_rand <- paste0(sample(c(letters,LETTERS),50,replace=TRUE),collapse="")
#Put the variables into the environment
var_env[["p_frame"]] <- p_frame
var_env[["var_str"]] <- var_str
var_env[["var_rand"]] <- var_rand
# Create the function that will be bound to the variable.
# Since this is an Active Binding (AB), we have three situations
# i) It is run without input, and thus the AB is
# being called on it's own (missing(input)),
# and thus it should evaluate and return the output of `call`
# ii) It is being run as the lhs of an assignment
# as part of the initial assignment phase, in which case
# we do nothing (i.e. input is the output of this function)
# iii) It is being run as the lhs of a regular assignment,
# in which case, we want to overwrite the AB
fun <- function(input){
if(missing(input)){
# No assignment: variable is being called on its own
# So, we activate the delayed assignment call:
res <- eval(call,p_frame)
rm(list=var_str,envir=p_frame)
assign(var_str,res,p_frame)
res
} else if(!inherits(input,"assign_delay") &&
input != var_rand){
# Attempting to assign to the variable
# and it is not the initial definition
# So we overwrite the active binding
res <- eval(substitute(input),p_frame)
rm(list=var_str,envir=p_frame)
assign(var_str,res,p_frame)
invisible(res)
}
# Else: We are assigning and the assignee is the output
# of this function, in which case, we do nothing!
}
#Fix the call in the above eval to be the exact call
# rather than a variable (useful for debugging)
# This is in the line res <- eval(call,p_frame)
body(fun)[[c(2,3,2,3,2)]] <- substitute(call)
#Put the function inside the environment with all
# all of the variables above
environment(fun) <- var_env
# Check if the variable already exists in the calling
# environment and if so, remove it
if(exists(var_str,envir=p_frame)){
rm(list=var_str,envir=p_frame)
}
# Create the AB
makeActiveBinding(var_sym,fun,p_frame)
# Return a specific object to check for
structure(var_rand,call="assign_delay")
}

Can I access the assignment of a function from inside that function? [duplicate]

For example, suppose I would like to be able to define a function that returned the name of the assignment variable concatenated with the first argument:
a <- add_str("b")
a
# "ab"
The function in the example above would look something like this:
add_str <- function(x) {
arg0 <- as.list(match.call())[[1]]
return(paste0(arg0, x))
}
but where the arg0 line of the function is replaced by a line that will get the name of the variable being assigned ("a") rather than the name of the function.
I've tried messing around with match.call and sys.call, but I can't get it to work. The idea here is that the assignment operator is being called on the variable and the function result, so that should be the parent call of the function call.
I think that it's not strictly possible, as other solutions explained, and the reasonable alternative is probably Yosi's answer.
However we can have fun with some ideas, starting simple and getting crazier gradually.
1 - define an infix operator that looks similar
`%<-add_str%` <- function(e1, e2) {
e2_ <- e2
e1_ <- as.character(substitute(e1))
eval.parent(substitute(e1 <- paste0(e1_,e2_)))
}
a %<-add_str% "b"
a
# "ab"
2 - Redefine := so that it makes available the name of the lhs to the rhs through a ..lhs() function
I think it's my favourite option :
`:=` <- function(lhs,rhs){
lhs_name <- as.character(substitute(lhs))
assign(lhs_name,eval(substitute(rhs)), envir = parent.frame())
lhs
}
..lhs <- function(){
eval.parent(quote(lhs_name),2)
}
add_str <- function(x){
res <- paste0(..lhs(),x)
res
}
a := add_str("b")
a
# [1] "ab"
There might be a way to redefine <- based on this, but I couldn't figure it out due to recursion issues.
3 - Use memory address dark magic to hunt lhs (if it exists)
This comes straight from: Get name of x when defining `(<-` operator
We'll need to change a bit the syntax and define the function fetch_name for this purpose, which is able to get the name of the rhs from a *<- function, where as.character(substitute(lhs)) would return "*tmp*".
fetch_name <- function(x,env = parent.frame(2)) {
all_addresses <- sapply(ls(env), pryr:::address2, env)
all_addresses <- all_addresses[names(all_addresses) != "*tmp*"]
all_addresses_short <- gsub("(^|<)[0x]*(.*?)(>|$)","\\2",all_addresses)
x_address <- tracemem(x)
untracemem(x)
x_address_short <- tolower(gsub("(^|<)[0x]*(.*?)(>|$)","\\2",x_address))
ind <- match(x_address_short, all_addresses_short)
x_name <- names(all_addresses)[ind]
x_name
}
`add_str<-` <- function(x,value){
x_name <- fetch_name(x)
paste0(x_name,value)
}
a <- NA
add_str(a) <- "b"
a
4- a variant of the latter, using .Last.value :
add_str <- function(value){
x_name <- fetch_name(.Last.value)
assign(x_name,paste0(x_name,value),envir = parent.frame())
paste0(x_name,value)
}
a <- NA;add_str("b")
a
# [1] "ab"
Operations don't need to be on the same line, but they need to follow each other.
5 - Again a variant, using a print method hack
Extremely dirty and convoluted, to please the tortured spirits and troll the others.
This is the only one that really gives the expected output, but it works only in interactive mode.
The trick is that instead of doing all the work in the first operation I also use the second (printing). So in the first step I return an object whose value is "b", but I also assigned a class "weird" to it and a printing method, the printing method then modifies the object's value, resets its class, and destroys itself.
add_str <- function(x){
class(x) <- "weird"
assign("print.weird", function(x) {
env <- parent.frame(2)
x_name <- fetch_name(x, env)
assign(x_name,paste0(x_name,unclass(x)),envir = env)
rm(print.weird,envir = env)
print(paste0(x_name,x))
},envir = parent.frame())
x
}
a <- add_str("b")
a
# [1] "ab"
(a <- add_str("b") will have the same effect as both lines above. print(a <- add_str("b")) would also have the same effect but would work in non interactive code, as well.
This is generally not possible because the operator <- is actually parsed to a call of the <- function:
rapply(as.list(quote(a <- add_str("b"))),
function(x) if (!is.symbol(x)) as.list(x) else x,
how = "list")
#[[1]]
#`<-`
#
#[[2]]
#a
#
#[[3]]
#[[3]][[1]]
#add_str
#
#[[3]][[2]]
#[1] "b"
Now, you can access earlier calls on the call stack by passing negative numbers to sys.call, e.g.,
foo <- function() {
inner <- sys.call()
outer <- sys.call(-1)
list(inner, outer)
}
print(foo())
#[[1]]
#foo()
#[[2]]
#print(foo())
However, help("sys.call") says this (emphasis mine):
Strictly, sys.parent and parent.frame refer to the context of the
parent interpreted function. So internal functions (which may or may
not set contexts and so may or may not appear on the call stack) may
not be counted, and S3 methods can also do surprising things.
<- is such an "internal function":
`<-`
#.Primitive("<-")
`<-`(x, foo())
x
#[[1]]
#foo()
#
#[[2]]
#NULL
As Roland pointed, the <- is outside of the scope of your function and could only be located looking at the stack of function calls, but this fail. So a possible solution could be to redefine the '<-' else than as a primitive or, better, to define something that does the same job and additional things too.
I don't know if the ideas behind following code can fit your needs, but you can define a "verbose assignation" :
`:=` <- function (var, value)
{
call = as.list(match.call())
message(sprintf("Assigning %s to %s.\n",deparse(call$value),deparse(call$var)))
eval(substitute(var <<- value))
return(invisible(value))
}
x := 1:10
# Assigning 1:10 to x.
x
# [1] 1 2 3 4 5 6 7 8 9 10
And it works in some other situation where the '<-' is not really an assignation :
y <- data.frame(c=1:3)
colnames(y) := "b"
# Assigning "b" to colnames(y).
y
# b
#1 1
#2 2
#3 3
z <- 1:4
dim(z) := c(2,2)
#Assigning c(2, 2) to dim(z).
z
# [,1] [,2]
#[1,] 1 3
#[2,] 2 4
>
I don't think the function has access to the variable it is being assigned to. It is outside of the function scope and you do not pass any pointer to it or specify it in any way. If you were to specify it as a parameter, you could do something like this:
add_str <- function(x, y) {
arg0 <-deparse(substitute(x))
return(paste0(arg0, y))
}
a <- 5
add_str(a, 'b')
#"ab"

Basics of argument passing in R's functions

How R interpret line : arg.list <- list(x, y) in below definition of function, does it copy x and y into arg.list object when execution happen or they are passed by reference ?
fplot <- function(x, y, add=FALSE){
arg.list <- list(x, y)
if(!add){
plot(arg.list))
}else{
lines(arg.list)
}
}
The variables are embedded into the list by reference (at least if you use vectors).
Proof:
library(pryr)
x <- 1:100
y <- 201:200
arg.list <- list(x,y)
al.x <- arg.list[[1]]
al.y <- arg.list[[2]]
Now look at the memory addresses (they are the same):
> address(x)
[1] "0x37598c0"
> address(y)
[1] "0x40fd6f8"
> address(al.x)
[1] "0x37598c0"
> address(al.y)
[1] "0x40fd6f8"
If you change one item a copy will be created ("copy on modification"):
> x[1]=42
> address(x)
[1] "0x417a470"
> al.x <- arg.list[[1]]
> address(al.x)
[1] "0x37598c0"
Edit:
As #HongOoi said: R semantically never uses references (except for objects in the environment class) but copies for variables. It "is clever enough to avoid copies until they are really required" ("copy on [first] modification"). Function parameters are passed "by value" semantically (even though references are used until a modification occurs).
The semantics of R is that function arguments are always passed by value. The underlying implementation may not necessarily make new copies of the arguments, so as to save memory. But your function will behave as if it has brand-new copies to work with.
This means you don't have to worry about changing a variable outside a function because you changed it inside:
x <- 1
f <- function(z) {
z <- z + 1
z
}
y <- f(x)
print(y) # y now contains 2
print(x) # but x still contains 1
If R was pass-by-reference, then modifying the argument of f would also modify the variable that was passed in. This doesn't happen.

What are Replacement Functions in R?

I searched for a reference to learn about replacement functions in R, but I haven't found any yet. I'm trying to understand the concept of the replacement functions in R. I have the code below but I don't understand it:
"cutoff<-" <- function(x, value){
x[x > value] <- Inf
x
}
and then we call cutoff with:
cutoff(x) <- 65
Could anyone explain what a replacement function is in R?
When you call
cutoff(x) <- 65
you are in effect calling
x <- "cutoff<-"(x = x, value = 65)
The name of the function has to be quoted as it is a syntactically valid but non-standard name and the parser would interpret <- as the operator not as part of the function name if it weren't quoted.
"cutoff<-"() is just like any other function (albeit with a weird name); it makes a change to its input argument on the basis of value (in this case it is setting any value in x greater than 65 to Inf (infinite)).
The magic is really being done when you call the function like this
cutoff(x) <- 65
because R is parsing that and pulling out the various bits to make the real call shown above.
More generically we have
FUN(obj) <- value
R finds function "FUN<-"() and sets up the call by passing obj and value into "FUN<-"() and arranges for the result of "FUN<-"() to be assigned back to obj, hence it calls:
obj <- "FUN<-"(obj, value)
A useful reference for this information is the R Language Definition Section 3.4.4: Subset assignment ; the discussion is a bit oblique, but seems to be the most official reference there is (replacement functions are mentioned in passing in the R FAQ (differences between R and S-PLUS), and in the R language reference (various technical issues), but I haven't found any further discussion in official documentation).
Gavin provides an excellent discussion of the interpretation of the replacement function. I wanted to provide a reference since you also asked for that: R Language Definition Section 3.4.4: Subset assignment.
As a complement to the accepted answer I would like to note that replacement functions can be defined also for non standard functions, namely operators (see ?Syntax) and control flow constructs. (see ?Control).
Note also that it is perfectly acceptable to design a generic and associated methods for replacement functions.
operators
When defining a new class it is common to define S3 methods for $<-, [[<- and [<-, some examples are data.table:::`$<-.data.table`, data.table:::`[<-.data.table`, or tibble:::`$.tbl_df`.
However for any other operator we can write a replacement function, some examples :
`!<-` <- function(x, value) !value
x <- NULL # x needs to exist before replacement functions are used!
!x <- TRUE
x
#> [1] FALSE
`==<-` <- function(e1, e2, value) replace(e1, e1 == e2, value)
x <- 1:3
x == 2 <- 200
x
#> [1] 1 200 3
`(<-` <- function(x, value) sapply(x, value, USE.NAMES = FALSE)
x <- c("foo", "bar")
(x) <- toupper
x
#> [1] "FOO" "BAR"
`%chrtr%<-` <- function(e1, e2, value) {
chartr(e2, value, e1)
}
x <- "woot"
x %chrtr% "o" <- "a"
x
#> [1] "waat"
we can even define <-<-, but the parser will prevent its usage if we call x <- y <- z, so we need to use the left to right assignment symbol
`<-<-` <- function(e1, e2, value){
paste(e2, e1, value)
}
x <- "b"
"a" -> x <- "c"
x
#> [1] "a b c"
Fun fact, <<- can have a double role
x <- 1:3
x < 2 <- NA # this fails but `<<-` was called!
#> Error in x < 2 <- NA: incorrect number of arguments to "<<-"
# ok let's define it then!
`<<-` <- function(x, y, value){
if (missing(value)) {
eval.parent(substitute(.Primitive("<<-")(x, y)))
} else {
replace(x, x < y, value)
}
}
x < 2 <- NA
x
#> [1] NA 2 3
x <<- "still works"
x
#> [1] "still works"
control flow constructs
These are in practice seldom encountered (in fact I'm responsible for the only practical use I know, in defining for<- for my package pbfor), but R is flexible enough, or crazy enough, to allow us to define them. However to actually use them, due to the way control flow constructs are parsed, we need to use the left to right assignment ->.
`repeat<-` <- function(x, value) replicate(value, x)
x <- "foo"
3 -> repeat x
x
#> [1] "foo" "foo" "foo"
function<-
function<- can be defined in principle but to the extent of my knowledge we can't do anything with it.
`function<-` <- function(x,value){NULL}
3 -> function(arg) {}
#> Error in function(arg) {: target of assignment expands to non-language object
Remember, in R everything operation is a function call (therefore also the assignment operations) and everything that exists is an object.
Replacement functions act as if they modify their arguments in place such as in
colnames(d) <- c("Input", "Output")
They have the identifier <- at the end of their name and return a modified copy of the argument object (non-primitive replacement functions) or the same object (primitive replacement functions)
At the R prompt, the following will not work:
> `second` <- function(x, value) {
+ x[2] <- value
+ x
+ }
> x <- 1:10
> x
[1] 1 2 3 4 5 6 7 8 9 10
> second(x) <- 9
Error in second(x) <- 9: couldn't find function "second<-"
As you can see, R is searching the environment not for second but for second<-.
So lets do the same thing but using such a function identifier instead:
> `second<-` <- function(x, value) {
+ x[2] <- value
+ x
+ }
Now, the assignment at the second position of the vector works:
> second(x) <- 9
> x
[1] 1 9 3 4 5 6 7 8 9 10
I also wrote a simple script to list all replacement functions in R base package, find it here.

Is there a "move" command in R (equivalent of <- followed by rm)?

This is one of those "there has to be a function for this" questions. It's not that big a deal, but it's just annoying enough that every time I rename an object I wonder if there's a better way.
Suppose I capitalize an object that I've created and realize I'd rather have it uncapitalized:
# Create test data
X <- runif(100)
# Rename the object
x <- X
rm(X)
Is there a one-command way of doing this (that also avoids the re-copy for memory/speed reasons)? There are a few commands named rename in various packages but they all work on elements within a list, rather than on the list (or other object) itself.
I don't know of a built in way to do this but you could easily write your own function to do something along these lines. For instance this does just that without any checking to make sure the object exists or whether or not there is already an object named what you want to rename to.
mv <- function(x, y){
x_name <- deparse(substitute(x))
y_name <- deparse(substitute(y))
assign(y_name, x, pos = 1)
rm(list = x_name, pos = 1)
invisible()
}
Some example use
> x <- 3
> x
[1] 3
> y
Error: object 'y' not found
> mv(x, y)
> x
Error: object 'x' not found
> y
[1] 3
Edit: For those that didn't follow the link in the comments here is a version written by Rolf Turner that does some checking to make sure the object we want to move actually exists and asks us if we want to overwrite an existing object if the new name already has an object in it.
mv <- function (a, b) {
anm <- deparse(substitute(a))
bnm <- deparse(substitute(b))
if (!exists(anm,where=1,inherits=FALSE))
stop(paste(anm, "does not exist.\n"))
if (exists(bnm,where=1,inherits=FALSE)) {
ans <- readline(paste("Overwrite ", bnm, "? (y/n) ", sep = ""))
if (ans != "y")
return(invisible())
}
assign(bnm, a, pos = 1)
rm(list = anm, pos = 1)
invisible()
}

Resources