In R, one can access list elements with $. When one accesses a field which is not included in the list, the resulting value is just NULL. This is problematic in the parts of my code where I further work with object. Take this code:
l <- list(foo = 1, bar = 2)
print(l$foobar)
The output will just be NULL and no error and no warning. I am aware that this might be needed such that assignment of new elements (l$foobar <- 3) can work.
Is there some way where I can make read-access of a field in a list a hard error if it does not exist?
An extreme option is to overload the $ operator, then define an S3 method for list objects that checks the names.
`$` <- function(x, y) {
UseMethod("$")
}
`$.list` <- function(x, y) {
ylab <- deparse(substitute(y))
stopifnot(ylab %in% names(x))
do.call(.Primative("$"), list(x, ylab))
}
Then:
myList <- list('a' = pi)
> myList$b
Error: ylab %in% names(x) is not TRUE
> myList$a
3.141593
You must, of course, be sure to set the following:
`$.default` <- base::`$`
to avoid any conflict with existing usage of the "$" operator
If you wish to continue to use partial matching with the "$" when applied to a list, you can make use of stopifnot(length(pmatch(ylab, names(x)))>0).
I know this is an old question with an already accepted answer writing a custom function, but I think get from base R should do the trick as well:
l <- list(foo = 1, bar = 2)
get('foobar', l)
#> Error in get("foobar", l): object 'foobar' not found
You could try writing your own function:
extract_ls <- function(ls, el){
if (!el %in% names(ls)) stop(paste0("The specified element ", el, " does not exist in the list"))
else return(ls[[el]])
}
Then you can do:
extract_ls(l, "baz")
#Error in extract_ls(l, "baz") :
# The specified element baz does not exist in the list
extract_ls(l, "bar")
#[1] 2
Note that I use [[ instead of $. These two perform the same operation, but $ does partial matching. Also [[ requires a string.
To make this an infix function, just change the function name to `%[[%`. The usage is then:
l %[[% "baz"
#Error in l %[[% "baz" :
# The specified element baz does not exist in the list
l %[[% "bar"
#[1] 2
Try this
el_names <- function(l,variable)
{
if(is.null(l[[variable]]) & length(l)>1)
{
stop("Null found")
}
else{
print(l[[variable]])
}
}
> el_names(l,"foo")
[1] 1
Related
For example, suppose I would like to be able to define a function that returned the name of the assignment variable concatenated with the first argument:
a <- add_str("b")
a
# "ab"
The function in the example above would look something like this:
add_str <- function(x) {
arg0 <- as.list(match.call())[[1]]
return(paste0(arg0, x))
}
but where the arg0 line of the function is replaced by a line that will get the name of the variable being assigned ("a") rather than the name of the function.
I've tried messing around with match.call and sys.call, but I can't get it to work. The idea here is that the assignment operator is being called on the variable and the function result, so that should be the parent call of the function call.
I think that it's not strictly possible, as other solutions explained, and the reasonable alternative is probably Yosi's answer.
However we can have fun with some ideas, starting simple and getting crazier gradually.
1 - define an infix operator that looks similar
`%<-add_str%` <- function(e1, e2) {
e2_ <- e2
e1_ <- as.character(substitute(e1))
eval.parent(substitute(e1 <- paste0(e1_,e2_)))
}
a %<-add_str% "b"
a
# "ab"
2 - Redefine := so that it makes available the name of the lhs to the rhs through a ..lhs() function
I think it's my favourite option :
`:=` <- function(lhs,rhs){
lhs_name <- as.character(substitute(lhs))
assign(lhs_name,eval(substitute(rhs)), envir = parent.frame())
lhs
}
..lhs <- function(){
eval.parent(quote(lhs_name),2)
}
add_str <- function(x){
res <- paste0(..lhs(),x)
res
}
a := add_str("b")
a
# [1] "ab"
There might be a way to redefine <- based on this, but I couldn't figure it out due to recursion issues.
3 - Use memory address dark magic to hunt lhs (if it exists)
This comes straight from: Get name of x when defining `(<-` operator
We'll need to change a bit the syntax and define the function fetch_name for this purpose, which is able to get the name of the rhs from a *<- function, where as.character(substitute(lhs)) would return "*tmp*".
fetch_name <- function(x,env = parent.frame(2)) {
all_addresses <- sapply(ls(env), pryr:::address2, env)
all_addresses <- all_addresses[names(all_addresses) != "*tmp*"]
all_addresses_short <- gsub("(^|<)[0x]*(.*?)(>|$)","\\2",all_addresses)
x_address <- tracemem(x)
untracemem(x)
x_address_short <- tolower(gsub("(^|<)[0x]*(.*?)(>|$)","\\2",x_address))
ind <- match(x_address_short, all_addresses_short)
x_name <- names(all_addresses)[ind]
x_name
}
`add_str<-` <- function(x,value){
x_name <- fetch_name(x)
paste0(x_name,value)
}
a <- NA
add_str(a) <- "b"
a
4- a variant of the latter, using .Last.value :
add_str <- function(value){
x_name <- fetch_name(.Last.value)
assign(x_name,paste0(x_name,value),envir = parent.frame())
paste0(x_name,value)
}
a <- NA;add_str("b")
a
# [1] "ab"
Operations don't need to be on the same line, but they need to follow each other.
5 - Again a variant, using a print method hack
Extremely dirty and convoluted, to please the tortured spirits and troll the others.
This is the only one that really gives the expected output, but it works only in interactive mode.
The trick is that instead of doing all the work in the first operation I also use the second (printing). So in the first step I return an object whose value is "b", but I also assigned a class "weird" to it and a printing method, the printing method then modifies the object's value, resets its class, and destroys itself.
add_str <- function(x){
class(x) <- "weird"
assign("print.weird", function(x) {
env <- parent.frame(2)
x_name <- fetch_name(x, env)
assign(x_name,paste0(x_name,unclass(x)),envir = env)
rm(print.weird,envir = env)
print(paste0(x_name,x))
},envir = parent.frame())
x
}
a <- add_str("b")
a
# [1] "ab"
(a <- add_str("b") will have the same effect as both lines above. print(a <- add_str("b")) would also have the same effect but would work in non interactive code, as well.
This is generally not possible because the operator <- is actually parsed to a call of the <- function:
rapply(as.list(quote(a <- add_str("b"))),
function(x) if (!is.symbol(x)) as.list(x) else x,
how = "list")
#[[1]]
#`<-`
#
#[[2]]
#a
#
#[[3]]
#[[3]][[1]]
#add_str
#
#[[3]][[2]]
#[1] "b"
Now, you can access earlier calls on the call stack by passing negative numbers to sys.call, e.g.,
foo <- function() {
inner <- sys.call()
outer <- sys.call(-1)
list(inner, outer)
}
print(foo())
#[[1]]
#foo()
#[[2]]
#print(foo())
However, help("sys.call") says this (emphasis mine):
Strictly, sys.parent and parent.frame refer to the context of the
parent interpreted function. So internal functions (which may or may
not set contexts and so may or may not appear on the call stack) may
not be counted, and S3 methods can also do surprising things.
<- is such an "internal function":
`<-`
#.Primitive("<-")
`<-`(x, foo())
x
#[[1]]
#foo()
#
#[[2]]
#NULL
As Roland pointed, the <- is outside of the scope of your function and could only be located looking at the stack of function calls, but this fail. So a possible solution could be to redefine the '<-' else than as a primitive or, better, to define something that does the same job and additional things too.
I don't know if the ideas behind following code can fit your needs, but you can define a "verbose assignation" :
`:=` <- function (var, value)
{
call = as.list(match.call())
message(sprintf("Assigning %s to %s.\n",deparse(call$value),deparse(call$var)))
eval(substitute(var <<- value))
return(invisible(value))
}
x := 1:10
# Assigning 1:10 to x.
x
# [1] 1 2 3 4 5 6 7 8 9 10
And it works in some other situation where the '<-' is not really an assignation :
y <- data.frame(c=1:3)
colnames(y) := "b"
# Assigning "b" to colnames(y).
y
# b
#1 1
#2 2
#3 3
z <- 1:4
dim(z) := c(2,2)
#Assigning c(2, 2) to dim(z).
z
# [,1] [,2]
#[1,] 1 3
#[2,] 2 4
>
I don't think the function has access to the variable it is being assigned to. It is outside of the function scope and you do not pass any pointer to it or specify it in any way. If you were to specify it as a parameter, you could do something like this:
add_str <- function(x, y) {
arg0 <-deparse(substitute(x))
return(paste0(arg0, y))
}
a <- 5
add_str(a, 'b')
#"ab"
So I'm changing the class of some functions that I'm building in R in order to add a description attribute and because I want to use S3 generics to handle everything for me. Basically, I have a structure like
foo <- function(x) x + 1
addFunction <- function(f, description) {
class(f) <- c("addFunction", "function")
attr(f, "description") <- description
f
}
foo <- addFunction(foo, "Add one")
and then I do stuff like
description <- function(x) UseMethod("description")
description.default <- function(x) deparse(substitute(x))
description.addFunction <- function(x) attr(x, "description")
This works fine, but it's not that elegant. I'm wondering if it is possible to define a new class of functions such that instances of this class can be defined in a syntax similar to the function syntax. In other words, is it possible to define addFunction such that foo is generated in the following way:
foo <- addFunction(description = "Add one", x) {
x + 1
}
(or something similar, I have no strong feelings about where the attribute should be added to the function)?
Thanks for reading!
Update: I have experimented a bit more with the idea, but haven't really reached any concrete results yet - so this is just an overview of my current (updated) thoughts on the subject:
I tried the idea of just copying the function()-function, giving it a different name and then manipulating it afterwards. However, this does not work and I would love any inputs on what is happening here:
> function2 <- `function`
> identical(`function`, function2)
[1] TRUE
> function(x) x
function(x) x
> function2(x) x
Error: unexpected symbol in "function2(x) x"
> function2(x)
Error: incorrect number of arguments to "function"
As function() is a primitive function, I tried looking at the C-code defining it for more clues. I was particularly intrigued by the error message from the function2(x) call. The C-code underlying function() is
/* Declared with a variable number of args in names.c */
SEXP attribute_hidden do_function(SEXP call, SEXP op, SEXP args, SEXP rho)
{
SEXP rval, srcref;
if (TYPEOF(op) == PROMSXP) {
op = forcePromise(op);
SET_NAMED(op, 2);
}
if (length(args) < 2) WrongArgCount("function");
CheckFormals(CAR(args));
rval = mkCLOSXP(CAR(args), CADR(args), rho);
srcref = CADDR(args);
if (!isNull(srcref)) setAttrib(rval, R_SrcrefSymbol, srcref);
return rval;
}
and from this, I conclude that for some reason, at least two of the four arguments call, op, args and rho are now required. From the signature of do_function() I am guessing that the four arguments passed to do_function should be a call, a promise, a list of arguments and then maybe an environment. I tried a lot of different combinations for function2 (including setting up to two of these arguments to NULL), but I keep getting the same (new) error message:
> function2(call("sum", 2, 1), NULL, list(x=NULL), baseenv())
Error: invalid formal argument list for "function"
> function2(call("sum", 2, 1), NULL, list(x=NULL), NULL)
Error: invalid formal argument list for "function"
This error message is returned from the C-function CheckFormals(), which I also looked up:
/* used in coerce.c */
void attribute_hidden CheckFormals(SEXP ls)
{
if (isList(ls)) {
for (; ls != R_NilValue; ls = CDR(ls))
if (TYPEOF(TAG(ls)) != SYMSXP)
goto err;
return;
}
err:
error(_("invalid formal argument list for \"function\""));
}
I'm not fluent in C at all, so from here on I'm not quite sure what to do next.
So these are my updated questions:
Why do function and function2 not behave in the same way? Why
do I need to call function2 using a different syntax when they are
deemed identical in R?
What are the proper arguments of function2
such that function2([arguments]) will actually define a function?
Some keywords in R such as if and function have special syntax in the way that the underlying functions get called. It's quite easy to use if as a function if desired, e.g.
`if`(1 == 1, "True", "False")
is equivalent to
if (1 == 1) {
"True"
} else {
"False"
}
function is trickier. There's some help on this at a previous question.
For your current problem here's one solution:
# Your S3 methods
description <- function(x) UseMethod("description")
description.default <- function(x) deparse(substitute(x))
description.addFunction <- function(x) attr(x, "description")
# Creates the pairlist for arguments, handling arguments with no defaults
# properly. Also brings in the description
addFunction <- function(description, ...) {
args <- eval(substitute(alist(...)))
tmp <- names(args)
if (is.null(tmp)) tmp <- rep("", length(args))
names(args)[tmp==""] <- args[tmp==""]
args[tmp==""] <- list(alist(x=)$x)
list(args = as.pairlist(args), description = description)
}
# Actually creates the function using the structure created by addFunction and the body
`%{%` <- function(args, body) {
stopifnot(is.pairlist(args$args), class(substitute(body)) == "{")
f <- eval(call("function", args$args, substitute(body), parent.frame()))
class(f) <- c("addFunction", "function")
attr(f, "description") <- args$description
f
}
# Example. Note that the braces {} are mandatory even for one line functions
foo <- addFunction(description = "Add one", x) %{% {
x + 1
}
foo(1)
#[1] 2
Is it possible to return an invisible object when using the S3 indexing function "[" on a custom class? For example, in the code below, is there a way to make the last line of code not print anything?
mat <- function(x) {
structure(x, class="mat")
}
"[.mat" <- function(x, i, j) {
invisible(unclass(x)[i,j])
}
m1 <- mat(matrix(1:10, ncol=2))
m1[1:2,]
[,1] [,2]
[1,] 1 6
[2,] 2 7
You are running into issues with the visibility mechanism caused by primitive functions. Consider:
> length.x <- function(x) invisible(23)
> length(structure(1:10, class="x"))
[1] 23
> mean.x <- function(x) invisible(23)
> mean(structure(1:10, class="x"))
> # no output
length is a primitive, but mean is not. From R Internals:
Whether the returned value of a top-level R expression is printed is controlled by the global boolean variable R_Visible. This is set (to true or false) on entry to all primitive and internal functions based on the eval column of the table in file src/main/names.c: the appropriate setting can be extracted by the macro PRIMPRINT.
and
Internal and primitive functions force the documented setting of R_Visible on return, unless the C code is allowed to change it (the exceptions above are indicated by PRIMPRINT having value 2).
So it would seem that you cannot force invisible returns from primitive generics like [, length, etc., and you must resort to workarounds like the one suggested by Alex.
The problem is that the value returned from [.mat is not of class mat since you're using unclass, so it uses the default printing method for whatever class it has. To fix this, just ensure that the returned object is still a mat and define a printing method for mat objects.
mat <- function(x) {
class(x) <- "mat"
x
}
`[.mat` <- function(x, i, j) {
y <- mat(unclass(x)[i, j])
invisible(y)
}
print.mat <- function(x, ...) {
invisible(x)
}
test <- mat(matrix(1:10, ncol = 2))
test[1, 1]
# Nothing is printed
In my program, I am recursively going over a nested list and adding elements to an overall list that I will return. There are a few details to be taken care of, so I can't just use unlist.
formulaPart is taken to be a formula object.
My code is:
parseVariables <- function(formulaPart, myList){
for(currentVar in as.list(formulaPart))
if(typeof(currentVar == 'language'
parseVariables(currentVar, myList)
else
if(! toString(currentVar) %in% c(\\various characters)
list <- c(list, currentVar)
}
I have checked that the function correctly adds elements to the list when it should. The problem is that the list loses elements due to recursion. The elements added during one inner recursive call are not saved for another recursive call.
If this was in C++, I could just use a pointer; the same for Java. However, I do not understand how to handle this error in R.
R does something like pass-by-value, so you can't modify (most) existing objects just by passing them into a function. If you want to add on to something recursively, one trick would be to use an environment instead, which get passed by reference. This can easily be coerced to list when you're done.
parseVariables <- function(formulaPart, myList){
for(currentVar in as.list(formulaPart)) {
if(typeof(currentVar) == 'language') {
parseVariables(currentVar, myList)
}
else {
if(! toString(currentVar) %in% c(':', '+', '~'))
assign(toString(currentVar), currentVar, myList)
}
}
}
f1 <- z ~ a:b + x
f2 <- z ~ x + y
myList <- new.env()
parseVariables(f1, myList)
parseVariables(f2, mylist)
ls(myList)
# [1] "a" "b" "x" "z"
as.list(myList)
# $x
# x
#
# $z
# z
#
# $a
# a
#
# $b
# b
I'm working on an R package that has a number of functions that follow a non-R-standard practice of modifying in place the object passed in as an argument. This normally works OK, but fails when the object to be modified is on a list.
An function to give an example of the form of the assignments:
myFun<-function(x){
xn <- deparse(substitute(x))
ev <- parent.frame()
# would do real stuff here ..
# instead set simple value to modify local copy
x[[1]]<-"b"
# assign in parent frame
if (exists(xn, envir = ev))
on.exit(assign(xn, x, pos = ev))
# return invisibly
invisible(x)
}
This works:
> myObj <-list("a")
> myFun(myObj)
> myObj
[[1]]
[1] "b"
But it does not work if the object is a member of a list:
> myObj <-list("a")
> myList<-list(myObj,myObj)
> myFun(myList[[1]])
> myList
[[1]]
[[1]][[1]]
[1] "a"
[[2]]
[[2]][[1]]
[1] "a"
After reading answers to other questions here, I see the docs for assign clearly state:
assign does not dispatch assignment methods, so it cannot be used to set elements of vectors, names, attributes, etc.
Since there is an existing codebase using these functions, we cannot abandon the modify-in-place syntax. Does anyone have suggestions for workarounds or alternative approaches for modifying objects which are members of a list in a parent frame?
UPDATE:
I've considered trying to roll my own assignment function, something like:
assignToListInEnv<-function(name,env,value){
# assume name is something like "myList[[1]]"
#check for brackets
index<-regexpr('[[',name,fixed=TRUE)[1]
if(index>0){
lname<-substr(name,0,index-1)
#check that it exists
if (exists(lname,where=env)){
target<-get(lname,pos=env)
# make sure it is a list
if (is.list(target)){
eval(parse(text=paste('target',substr(name,index,999),'<-value',sep='')))
assign(lname, target, pos = env)
} else {
stop('object ',lname,' is not a list in environment ',env)
}
} else {
stop('unable to locate object ',lname,' in frame ',env)
}
}
}
But it seems horrible brittle, would need to handle many more cases ($ and [ as well as [[) and would probably still fail for [[x]] because x would be evaluated in the wrong frame...
Since it was in the first search results to my query, here's my solution :
You can use paste() with "<<-" to create an expression which will assign the value to your list element when evaluated.
assignToListInEnv<-function(name, value, env = parent.frame()){
cl <- as.list(match.call())
lang <- str2lang(paste(cl["name"], "<<-", cl["value"]))
eval(lang, envir = env)
}
EDIT : revisiting this answer because it got a vote up
I'm not sure why I used <<- instead of <-. If using the 'env' argument, <<-with assign to the parent.frame of that env.
So if you always want it to be the first parent.frame it can just be :
assignToListInParentFrame<-function(name, value){
cl <- as.list(match.call())
paste(cl["name"], "<<-", cl["value"]) |>
str2lang() |>
eval()
}
and if you want to precise in which env to modify the list :
assignToListInEnv<-function(name, value, env){
cl <- as.list(match.call())
paste(cl["name"], "<-", cl["value"]) |>
str2lang() |>
eval(envir = env)
}