How does R interpret the line arg.list <- list(x, y) in the function definition below? Does it copy x and y into the arg.list object when the function executes, or are they passed by reference?
fplot <- function(x, y, add=FALSE){
arg.list <- list(x, y)
if(!add){
plot(arg.list)
}else{
lines(arg.list)
}
}
The variables are embedded into the list by reference (at least if you use vectors).
Proof:
library(pryr)
x <- 1:100
y <- 201:200
arg.list <- list(x,y)
al.x <- arg.list[[1]]
al.y <- arg.list[[2]]
Now look at the memory addresses (they are the same):
> address(x)
[1] "0x37598c0"
> address(y)
[1] "0x40fd6f8"
> address(al.x)
[1] "0x37598c0"
> address(al.y)
[1] "0x40fd6f8"
If you change one item a copy will be created ("copy on modification"):
> x[1]=42
> address(x)
[1] "0x417a470"
> al.x <- arg.list[[1]]
> address(al.x)
[1] "0x37598c0"
Edit:
As @HongOoi said: semantically, R never uses references for variables (except for environment objects) but copies. It "is clever enough to avoid copies until they are really required" ("copy on [first] modification"). Function parameters are semantically passed "by value" (even though references are used internally until a modification occurs).
The semantics of R is that function arguments are always passed by value. The underlying implementation may not necessarily make new copies of the arguments, so as to save memory. But your function will behave as if it has brand-new copies to work with.
This means you don't have to worry about changing a variable outside a function because you changed it inside:
x <- 1
f <- function(z) {
z <- z + 1
z
}
y <- f(x)
print(y) # y now contains 2
print(x) # but x still contains 1
If R were pass-by-reference, then modifying the argument of f would also modify the variable that was passed in. This doesn't happen.
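The one exception mentioned above, environments, can be contrasted with ordinary vectors in a small sketch. The function names bump_vector and bump_env are made up for illustration:

```r
# A vector argument behaves like a private copy inside the function.
bump_vector <- function(v) {
  v[1] <- 99
  v
}

# An environment is the exception: it is modified in place.
bump_env <- function(e) {
  e$count <- e$count + 1
  invisible(NULL)
}

v <- c(1, 2, 3)
w <- bump_vector(v)
v        # unchanged: 1 2 3
w        # modified copy: 99 2 3

e <- new.env()
e$count <- 0
bump_env(e)
e$count  # 1, the caller's environment was changed
```

So by-value semantics hold for vectors, lists and data frames, while anything stored in an environment is shared between caller and callee.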
What factors should I consider when deciding whether or not to remove a variable that will not be used again in a function?
Here's a noddy example:
DivideByLower <- function (a, b) {
if (a > b) {
tmp <- a
a <- b
b <- tmp
remove(tmp) # When should I include this line?
}
# Return:
a / b
}
I understand that tmp will be removed when the function finishes executing, but should I ever be concerned about removing it earlier?
From Hadley Wickham's Advanced R:
In some languages, you have to explicitly delete unused objects for
their memory to be returned. R uses an alternative approach: garbage
collection (or GC for short). GC automatically releases memory when an
object is no longer used. It does this by tracking how many names
point to each object, and when there are no names pointing to an
object, it deletes that object.
In the case you're describing garbage collection will release the memory.
If your function returns another function (Hadley calls these the function factory and the manufactured function, respectively), the variables created in the body of the function factory remain available in the enclosing environment of the manufactured function, and their memory won't be freed.
More info, still in Hadley's book, can be found in the chapter about function factories.
function_factory <- function(x){
force(x)
y <- "bar"
fun <- function(z){
sprintf("x, y, and z are all accessible and their values are '%s', '%s', and '%s'",
x, y, z)
}
fun
}
manufactured_function <- function_factory("foo")
manufactured_function("baz")
#> [1] "x, y, and z are all accessible and their values are 'foo', 'bar', and 'baz'"
Created on 2019-07-08 by the reprex package (v0.3.0)
In this case, if you want to control which variables are available in the enclosing environment, or be sure you don't clutter your memory, you might want to remove unnecessary objects, either by using rm / remove as you did, or as I tend to prefer, wrapped in an on.exit statement.
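A minimal sketch of the on.exit approach, assuming a hypothetical factory make_scaler that needs a large intermediate vector only while it runs:

```r
make_scaler <- function(n) {
  big_tmp <- runif(n)               # large intermediate, only needed here
  on.exit(rm(big_tmp), add = TRUE)  # dropped from this environment on return
  centre <- mean(big_tmp)
  function(z) z - centre            # manufactured function keeps `centre` only
}

scaler <- make_scaler(1e5)
ls(environment(scaler))
#> [1] "centre" "n"
```

Without the on.exit line, big_tmp would stay alive for as long as scaler exists, because the manufactured function keeps a reference to the factory's environment.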
Another case in which I might use rm is if I want to access variables from a parent environment without risk of them being overridden inside of the function, but in that case it's often possible and cleaner to use eval.parent.
y <- 2
z <- 3
test0 <- function(x, var){
y <- 1
x + eval(substitute(var))
}
# oops, the value of y is the one defined in the body
test0(0, y)
#> [1] 1
test0(0, z)
#> [1] 3
# but it will work using eval.parent :
test1 <- function(x, var){
y <- 1
x + eval.parent(substitute(var))
}
test1(0, y)
#> [1] 2
test1(0, z)
#> [1] 3
# in some cases (better avoided), it can be easier/quick and dirty to do something like :
test2 <- function(x, var){
y <- 1
# whatever code using y
rm(y)
x + eval(substitute(var))
}
test2(0, y)
#> [1] 2
test2(0, z)
#> [1] 3
For example, suppose I would like to be able to define a function that returned the name of the assignment variable concatenated with the first argument:
a <- add_str("b")
a
# "ab"
The function in the example above would look something like this:
add_str <- function(x) {
arg0 <- as.list(match.call())[[1]]
return(paste0(arg0, x))
}
but where the arg0 line of the function is replaced by a line that will get the name of the variable being assigned ("a") rather than the name of the function.
I've tried messing around with match.call and sys.call, but I can't get it to work. The idea here is that the assignment operator is being called on the variable and the function result, so that should be the parent call of the function call.
I think that it's not strictly possible, as other solutions explained, and the reasonable alternative is probably Yosi's answer.
However we can have fun with some ideas, starting simple and getting crazier gradually.
1 - define an infix operator that looks similar
`%<-add_str%` <- function(e1, e2) {
e2_ <- e2
e1_ <- as.character(substitute(e1))
eval.parent(substitute(e1 <- paste0(e1_,e2_)))
}
a %<-add_str% "b"
a
# "ab"
2 - Redefine := so that it makes available the name of the lhs to the rhs through a ..lhs() function
I think it's my favourite option :
`:=` <- function(lhs,rhs){
lhs_name <- as.character(substitute(lhs))
assign(lhs_name,eval(substitute(rhs)), envir = parent.frame())
lhs
}
..lhs <- function(){
eval.parent(quote(lhs_name),2)
}
add_str <- function(x){
res <- paste0(..lhs(),x)
res
}
a := add_str("b")
a
# [1] "ab"
There might be a way to redefine <- based on this, but I couldn't figure it out due to recursion issues.
3 - Use memory address dark magic to hunt lhs (if it exists)
This comes straight from: Get name of x when defining `(<-` operator
We'll need to change the syntax a bit and define the function fetch_name for this purpose. It is able to get the name of the assignment target from a *<- function, where as.character(substitute(lhs)) would return "*tmp*".
fetch_name <- function(x,env = parent.frame(2)) {
all_addresses <- sapply(ls(env), pryr:::address2, env)
all_addresses <- all_addresses[names(all_addresses) != "*tmp*"]
all_addresses_short <- gsub("(^|<)[0x]*(.*?)(>|$)","\\2",all_addresses)
x_address <- tracemem(x)
untracemem(x)
x_address_short <- tolower(gsub("(^|<)[0x]*(.*?)(>|$)","\\2",x_address))
ind <- match(x_address_short, all_addresses_short)
x_name <- names(all_addresses)[ind]
x_name
}
`add_str<-` <- function(x,value){
x_name <- fetch_name(x)
paste0(x_name,value)
}
a <- NA
add_str(a) <- "b"
a
4 - a variant of the latter, using .Last.value
add_str <- function(value){
x_name <- fetch_name(.Last.value)
assign(x_name,paste0(x_name,value),envir = parent.frame())
paste0(x_name,value)
}
a <- NA;add_str("b")
a
# [1] "ab"
Operations don't need to be on the same line, but they need to follow each other.
5 - Again a variant, using a print method hack
Extremely dirty and convoluted, to please the tortured spirits and troll the others.
This is the only one that really gives the expected output, but it works only in interactive mode.
The trick is that instead of doing all the work in the first operation I also use the second (printing). So in the first step I return an object whose value is "b", but I also assigned a class "weird" to it and a printing method, the printing method then modifies the object's value, resets its class, and destroys itself.
add_str <- function(x){
class(x) <- "weird"
assign("print.weird", function(x) {
env <- parent.frame(2)
x_name <- fetch_name(x, env)
assign(x_name,paste0(x_name,unclass(x)),envir = env)
rm(print.weird,envir = env)
print(paste0(x_name,x))
},envir = parent.frame())
x
}
a <- add_str("b")
a
# [1] "ab"
(a <- add_str("b")) will have the same effect as both lines above. print(a <- add_str("b")) would also have the same effect, and would work in non-interactive code as well.
This is generally not possible because the operator <- is actually parsed to a call of the <- function:
rapply(as.list(quote(a <- add_str("b"))),
function(x) if (!is.symbol(x)) as.list(x) else x,
how = "list")
#[[1]]
#`<-`
#
#[[2]]
#a
#
#[[3]]
#[[3]][[1]]
#add_str
#
#[[3]][[2]]
#[1] "b"
Now, you can access earlier calls on the call stack by passing negative numbers to sys.call, e.g.,
foo <- function() {
inner <- sys.call()
outer <- sys.call(-1)
list(inner, outer)
}
print(foo())
#[[1]]
#foo()
#[[2]]
#print(foo())
However, help("sys.call") says this (emphasis mine):
Strictly, sys.parent and parent.frame refer to the context of the
parent interpreted function. So internal functions (which may or may
not set contexts and so may or may not appear on the call stack) may
not be counted, and S3 methods can also do surprising things.
<- is such an "internal function":
`<-`
#.Primitive("<-")
`<-`(x, foo())
x
#[[1]]
#foo()
#
#[[2]]
#NULL
As Roland pointed out, the <- is outside the scope of your function and could only be located by looking at the stack of function calls, but this fails. So a possible solution could be to redefine '<-' as something other than a primitive or, better, to define something that does the same job and additional things too.
I don't know if the ideas behind the following code fit your needs, but you can define a "verbose assignment":
`:=` <- function (var, value)
{
call = as.list(match.call())
message(sprintf("Assigning %s to %s.\n",deparse(call$value),deparse(call$var)))
eval(substitute(var <<- value))
return(invisible(value))
}
x := 1:10
# Assigning 1:10 to x.
x
# [1] 1 2 3 4 5 6 7 8 9 10
And it works in some other situations where the '<-' is not really an assignment:
y <- data.frame(c=1:3)
colnames(y) := "b"
# Assigning "b" to colnames(y).
y
# b
#1 1
#2 2
#3 3
z <- 1:4
dim(z) := c(2,2)
#Assigning c(2, 2) to dim(z).
z
# [,1] [,2]
#[1,] 1 3
#[2,] 2 4
I don't think the function has access to the variable it is being assigned to. It is outside of the function scope and you do not pass any pointer to it or specify it in any way. If you were to specify it as a parameter, you could do something like this:
add_str <- function(x, y) {
arg0 <-deparse(substitute(x))
return(paste0(arg0, y))
}
a <- 5
add_str(a, 'b')
#"ab"
I searched for a reference to learn about replacement functions in R, but I haven't found any yet. I'm trying to understand the concept of the replacement functions in R. I have the code below but I don't understand it:
"cutoff<-" <- function(x, value){
x[x > value] <- Inf
x
}
and then we call cutoff with:
cutoff(x) <- 65
Could anyone explain what a replacement function is in R?
When you call
cutoff(x) <- 65
you are in effect calling
x <- "cutoff<-"(x = x, value = 65)
The name of the function has to be quoted as it is a syntactically valid but non-standard name and the parser would interpret <- as the operator not as part of the function name if it weren't quoted.
"cutoff<-"() is just like any other function (albeit with a weird name); it makes a change to its input argument on the basis of value (in this case it is setting any value in x greater than 65 to Inf (infinite)).
The magic is really being done when you call the function like this
cutoff(x) <- 65
because R is parsing that and pulling out the various bits to make the real call shown above.
More generically we have
FUN(obj) <- value
R finds function "FUN<-"() and sets up the call by passing obj and value into "FUN<-"() and arranges for the result of "FUN<-"() to be assigned back to obj, hence it calls:
obj <- "FUN<-"(obj, value)
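As a quick check of that equivalence, here is a small sketch that reuses the cutoff<- definition from the question on two copies of the same made-up vector, once with the replacement syntax and once with the explicit call:

```r
`cutoff<-` <- function(x, value) {
  x[x > value] <- Inf
  x
}

x1 <- c(10, 70, 65, 90)
x2 <- x1

cutoff(x1) <- 65                   # replacement syntax
x2 <- `cutoff<-`(x2, value = 65)   # the call R effectively makes

identical(x1, x2)
#> [1] TRUE
x1
#> [1]  10 Inf  65 Inf
```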
A useful reference for this information is the R Language Definition Section 3.4.4: Subset assignment ; the discussion is a bit oblique, but seems to be the most official reference there is (replacement functions are mentioned in passing in the R FAQ (differences between R and S-PLUS), and in the R language reference (various technical issues), but I haven't found any further discussion in official documentation).
Gavin provides an excellent discussion of the interpretation of the replacement function. I wanted to provide a reference since you also asked for that: R Language Definition Section 3.4.4: Subset assignment.
As a complement to the accepted answer, I would like to note that replacement functions can also be defined for non-standard functions, namely operators (see ?Syntax) and control flow constructs (see ?Control).
Note also that it is perfectly acceptable to design a generic and associated methods for replacement functions.
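A minimal sketch of such a generic, using a hypothetical label<- replacement function with a default method and a data.frame method (none of these names come from an existing package):

```r
# Generic replacement function: dispatches on the class of x
`label<-` <- function(x, value) UseMethod("label<-")

`label<-.default` <- function(x, value) {
  attr(x, "label") <- value
  x
}

`label<-.data.frame` <- function(x, value) {
  attr(x, "label") <- paste("table:", value)
  x
}

v <- 1:3
label(v) <- "counts"
df <- data.frame(a = 1:2)
label(df) <- "scores"

attr(v, "label")
#> [1] "counts"
attr(df, "label")
#> [1] "table: scores"
```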
operators
When defining a new class it is common to define S3 methods for $<-, [[<- and [<-; some examples are data.table:::`$<-.data.table`, data.table:::`[<-.data.table`, or tibble:::`$<-.tbl_df`.
However for any other operator we can write a replacement function, some examples :
`!<-` <- function(x, value) !value
x <- NULL # x needs to exist before replacement functions are used!
!x <- TRUE
x
#> [1] FALSE
`==<-` <- function(e1, e2, value) replace(e1, e1 == e2, value)
x <- 1:3
x == 2 <- 200
x
#> [1] 1 200 3
`(<-` <- function(x, value) sapply(x, value, USE.NAMES = FALSE)
x <- c("foo", "bar")
(x) <- toupper
x
#> [1] "FOO" "BAR"
`%chrtr%<-` <- function(e1, e2, value) {
chartr(e2, value, e1)
}
x <- "woot"
x %chrtr% "o" <- "a"
x
#> [1] "waat"
We can even define <-<-, but the parser will prevent its usage if we call x <- y <- z, so we need to use the left-to-right assignment symbol:
`<-<-` <- function(e1, e2, value){
paste(e2, e1, value)
}
x <- "b"
"a" -> x <- "c"
x
#> [1] "a b c"
Fun fact, <<- can have a double role
x <- 1:3
x < 2 <- NA # this fails but `<<-` was called!
#> Error in x < 2 <- NA: incorrect number of arguments to "<<-"
# ok let's define it then!
`<<-` <- function(x, y, value){
if (missing(value)) {
eval.parent(substitute(.Primitive("<<-")(x, y)))
} else {
replace(x, x < y, value)
}
}
x < 2 <- NA
x
#> [1] NA 2 3
x <<- "still works"
x
#> [1] "still works"
control flow constructs
These are in practice seldom encountered (in fact I'm responsible for the only practical use I know, in defining for<- for my package pbfor), but R is flexible enough, or crazy enough, to allow us to define them. However to actually use them, due to the way control flow constructs are parsed, we need to use the left to right assignment ->.
`repeat<-` <- function(x, value) replicate(value, x)
x <- "foo"
3 -> repeat x
x
#> [1] "foo" "foo" "foo"
function<-
function<- can be defined in principle, but to the best of my knowledge we can't do anything with it.
`function<-` <- function(x,value){NULL}
3 -> function(arg) {}
#> Error in function(arg) {: target of assignment expands to non-language object
Remember, in R every operation is a function call (and therefore so are assignment operations) and everything that exists is an object.
Replacement functions act as if they modify their arguments in place such as in
colnames(d) <- c("Input", "Output")
They have the identifier <- at the end of their name and return a modified copy of the argument object (non-primitive replacement functions) or the same object (primitive replacement functions).
At the R prompt, the following will not work:
> `second` <- function(x, value) {
+ x[2] <- value
+ x
+ }
> x <- 1:10
> x
[1] 1 2 3 4 5 6 7 8 9 10
> second(x) <- 9
Error in second(x) <- 9: couldn't find function "second<-"
As you can see, R is searching the environment not for second but for second<-.
So let's do the same thing but using such a function identifier instead:
> `second<-` <- function(x, value) {
+ x[2] <- value
+ x
+ }
Now, the assignment at the second position of the vector works:
> second(x) <- 9
> x
[1] 1 9 3 4 5 6 7 8 9 10
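Replacement functions can also take extra arguments between the object and value, as base R's attr<- does. A small sketch generalising second<- to a hypothetical nth<-:

```r
# Extra arguments go between x and value; value must come last
`nth<-` <- function(x, n, value) {
  x[n] <- value
  x
}

x <- 1:10
nth(x, 3) <- 0L   # effectively x <- `nth<-`(x, 3, value = 0L)
x
#> [1]  1  2  0  4  5  6  7  8  9 10
```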
I also wrote a simple script to list all replacement functions in R base package, find it here.
This is one of those "there has to be a function for this" questions. It's not that big a deal, but it's just annoying enough that every time I rename an object I wonder if there's a better way.
Suppose I capitalize an object that I've created and realize I'd rather have it uncapitalized:
# Create test data
X <- runif(100)
# Rename the object
x <- X
rm(X)
Is there a one-command way of doing this (that also avoids the re-copy for memory/speed reasons)? There are a few commands named rename in various packages but they all work on elements within a list, rather than on the list (or other object) itself.
I don't know of a built in way to do this but you could easily write your own function to do something along these lines. For instance this does just that without any checking to make sure the object exists or whether or not there is already an object named what you want to rename to.
mv <- function(x, y){
x_name <- deparse(substitute(x))
y_name <- deparse(substitute(y))
assign(y_name, x, pos = 1)
rm(list = x_name, pos = 1)
invisible()
}
Some example use
> x <- 3
> x
[1] 3
> y
Error: object 'y' not found
> mv(x, y)
> x
Error: object 'x' not found
> y
[1] 3
Edit: For those that didn't follow the link in the comments here is a version written by Rolf Turner that does some checking to make sure the object we want to move actually exists and asks us if we want to overwrite an existing object if the new name already has an object in it.
mv <- function (a, b) {
anm <- deparse(substitute(a))
bnm <- deparse(substitute(b))
if (!exists(anm,where=1,inherits=FALSE))
stop(paste(anm, "does not exist.\n"))
if (exists(bnm,where=1,inherits=FALSE)) {
ans <- readline(paste("Overwrite ", bnm, "? (y/n) ", sep = ""))
if (ans != "y")
return(invisible())
}
assign(bnm, a, pos = 1)
rm(list = anm, pos = 1)
invisible()
}
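Both versions above always work on position 1 of the search path (the global environment). As a sketch only, a variant that renames within the caller's environment instead might look like this; mv_in and its envir argument are hypothetical names, not part of Rolf Turner's code:

```r
mv_in <- function(old, new, envir = parent.frame()) {
  old_name <- deparse(substitute(old))
  new_name <- deparse(substitute(new))
  if (!exists(old_name, envir = envir, inherits = FALSE))
    stop(old_name, " does not exist.", call. = FALSE)
  # Bind the existing value to the new name, then drop the old name
  assign(new_name, get(old_name, envir = envir, inherits = FALSE), envir = envir)
  rm(list = old_name, envir = envir)
  invisible(NULL)
}

Scores <- c(0.1, 0.2)
mv_in(Scores, scores)
exists("Scores", inherits = FALSE)
#> [1] FALSE
scores
#> [1] 0.1 0.2
```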
I have defined a function called once as follows:
once <- function(x, value) {
xname <- deparse(substitute(x))
if(!exists(xname)) {
assign(xname, value, envir=parent.frame())
}
invisible()
}
The idea is that value is time-consuming to evaluate, and I only want to assign it to x the first time I run a script.
> z
Error: object 'z' not found
> once(z, 3)
> z
[1] 3
I'd really like the usage to be once(x) <- value rather than once(x, value), but if I write a function once<- it gets upset that the variable doesn't exist:
> once(z) <- 3
Error in once(z) <- 3 : object 'z' not found
Does anyone have a way around this?
ps: is there a name to describe functions like once<- or in general f<-?
If you are willing to modify your requirements slightly to use square brackets rather than parentheses then you could do this:
once <- structure(NA, class = "once")
"[<-.once" <- function(once, x, value) {
xname <- deparse(substitute(x))
pf <- parent.frame()
if (!exists(xname, pf)) assign(xname, value, pf)
once
}
# assigns 3 to x (assuming x does not currently exist)
once[x] <- 3
x # 3
# skips assignment (since x now exists)
once[x] <- 4
x # 3
As per section 3.4.4 of the R Language Definition, something like a names replacement is evaluated like this:
`*tmp*` <- x
x <- "names<-"(`*tmp*`, value=c("a","b"))
rm(`*tmp*`)
This is bad news for your requirement, because the assignment will fail on the first line (as x is not found), and even if it worked, your deparse(substitute) call would never evaluate to what you want.
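The failure on the first step can be observed directly. In this sketch, once<- is a stand-in definition that records whether it was ever called, and not_yet_defined is assumed not to exist in the session:

```r
called <- FALSE
`once<-` <- function(x, value) {
  called <<- TRUE   # record that the replacement function actually ran
  value
}

outcome <- tryCatch({
  once(not_yet_defined) <- 3   # target variable does not exist
  "assigned"
}, error = function(e) "error")

outcome
#> [1] "error"
called
#> [1] FALSE
```

The error is raised while R looks up the target variable, so once<- never gets a chance to handle the missing object itself.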
Sorry to disappoint you