I am new to using match.arg for default value specification in R functions. And I have a query regarding the below behavior.
trial_func <- function(a=c("1","9","20"),b=c("12","3"),d=c("55","01")){
a <- match.arg(a)
b <- match.arg(b)
d <- match.arg(d)
list(a,b,d)
}
trial_func()
# [[1]]
# [1] "1"
#
# [[2]]
# [1] "12"
#
# [[3]]
# [1] "55"
When I try using match.arg for each individual argument, it works as expected. But when I try to use an lapply to reduce the code written, it causes the below issue.
trial_func_apply <- function(a=c("1","9","20"),b=c("12","3"),d=c("55","01")){
lapply(list(a,b,d), match.arg)
}
trial_func_apply()
Error in FUN(X[[i]], ...) : 'arg' must be of length 1
Am I missing something here?
It's an old question, but I feel it's a great one, so I will try to provide extensive explanation for it by explaining the following:
Read the relevant documentation for ?match.arg
Make match.arg fail to guess the choices
Learn three features of the R language that match.arg uses underneath.
Simplified match.arg implementation
Make the lapply example of the question work
match.arg documentation
The usage tells you that match.arg needs the selected option you want to match (arg) and all the possible choices:
match.arg(arg, choices, several.ok = FALSE)
If we read choices, we see that it can often be missing, and we should read more in the details... How could match.arg work without having the possible choices, we wonder?
choices: a character vector of candidate values, often missing, see
‘Details’.
Maybe the Details section gives some hints (bold is mine):
Details:
In the one-argument form ‘match.arg(arg)’, the choices are
obtained from a default setting for the formal argument ‘arg’ of
the function from which ‘match.arg’ was called. (Since default
argument matching will set ‘arg’ to ‘choices’, this is allowed as
an exception to the ‘length one unless ‘several.ok’ is ‘TRUE’’
rule, and returns the first element.)
So, if you don't specify the choices argument, R will make a bit of effort to guess it right automagically. For the R magic to work, several conditions must be fulfilled:
The match.arg function must be called directly from the function with the argument
The name of the variable to be matched must be the name of the argument.
match.arg() can be tricked:
Let's make match.arg() fail to guess the choices:
dummy_fun1 <- function(x = c("a", "b"), y = "c") {
# If you name your argument like another argument
y <- x
# The guessed choices will correspond to y (how could it know they were x?)
wrong_choices <- match.arg(y)
}
dummy_fun1(x = "a")
# Error in match.arg(y) : 'arg' should be “c”
dummy_fun2 <- function(x = c("a", "b"), y = "c") {
# If you name your argument differently
z <- x
# You don't get any guess:
wrong_choices <- match.arg(z)
}
dummy_fun2(x="a")
#Error in match.arg(z) : 'arg' should be one of
Three R language features that match.arg needs and uses
(1) It uses non-standard evaluation to get the name of the variable:
whats_the_var_name_called <- function(arg) {
as.character(substitute(arg))
}
x <- 3
whats_the_var_name_called(x)
# "x"
y <- x
whats_the_var_name_called(y)
# "y"
(2) It uses sys.function() to get the caller function:
this_function_returns_its_caller <- function() {
sys.function(1)
}
this_function_returns_itself <- function() {
me <- this_function_returns_its_caller()
message("This is the body of this_function_returns_itself")
me
}
> this_function_returns_itself()
This is the body of this_function_returns_itself
function() {
me <- this_function_returns_its_caller()
message("This is the body of this_function_returns_itself")
me
}
(3) It uses formals() to get the possible values:
a_function_with_default_values <- function(x=c("a", "b"), y = 3) {
}
formals(a_function_with_default_values)[["x"]]
#c("a", "b")
How does match.arg work?
Combining these things, match.arg uses substitute() to get the name of the args variable, it uses sys.function() to get the caller function, and it uses formals() on the caller function with the argument name to get the default values of the function (the choices):
get_choices <- function(arg, choices) {
if (missing(choices)) {
arg_name <- as.character(substitute(arg))
caller_fun <- sys.function(1)
choices_as_call <- formals(caller_fun)[[arg_name]]
choices <- eval(choices_as_call)
}
choices
}
dummy_fun3 <- function(x = c("a", "b"), y = "c") {
get_choices(x)
}
dummy_fun3()
#[1] "a" "b"
Since we now know the magic used to get the choices, so we can create our match.arg implementation:
my_match_arg <- function(arg, choices) {
if (missing(choices)) {
arg_name <- as.character(substitute(arg))
caller_fun <- sys.function(1)
choices_as_call <- formals(caller_fun)[[arg_name]]
choices <- eval(choices_as_call)
}
# Really simple and cutting corners... but you get the idea:
arg <- arg[1]
if (! arg %in% choices) {
stop("Wrong choice")
}
arg
}
dummy_fun4 <- function(x = c("a", "b"), y = "c") {
my_match_arg(x)
}
dummy_fun4(x="d")
# Error in my_match_arg(x) : Wrong choice
dummy_fun4(x="a")
# [1] "a"
And that's how match.arg works.
Why it does not work under lapply? How to fix it?
To guess the choices argument, we look at the caller argument. When we use match.arg() inside an lapply call, the caller is not our function, so match.arg fails to guess the choices. We can get the choices manually and provide the choices manually:
trial_func_apply <- function(a=c("1","9","20"),b=c("12","3"),d=c("55","01")){
this_func <- sys.function()
the_args <- formals(this_func)
default_choices <- list(
eval(the_args[["a"]]),
eval(the_args[["b"]]),
eval(the_args[["d"]])
)
# mapply instead of lapply because we have two lists we
# want to apply match.arg to
mapply(match.arg, list(a,b,d), default_choices)
}
trial_func_apply()
# [1] "1" "12" "55"
Please note that I am cutting corners by not defining the environments where all the evals should happen, because in the examples above they work as-is. There may be some corner cases that make this examples to fail, so don't use them in production.
After investigating a bit, you need to pass the argument that your character vector is NULL, i.e.
trial_func_apply <- function(a=c("1","9","20"),b=c("12","3"),d=c("55","01")){
lapply(list(a,b,d), function(i)match.arg(NULL, i))
}
trial_func_apply()
#[[1]]
#[1] "1"
#[[2]]
#[1] "12"
#[[3]]
#[1] "55"
Related
I have a function f that takes three arguments, and returns the last one. For example:
f <- function(x,y,z){
return(z)
}
f(11,22,33) #yields 33
However, later on I may have more/less than three arguments, so I want to create a function g that returns the last argument of ...
g <- function(...){
#return final argument in '...'
}
g(11,22) #should yield 22
g(11,22,33,44,'abc') #should yield 'abc'
Is there any simple way to do this?
I've looked at existing posts on using ..., but they all seem to use it to pass all the arguments to another function (which is not what I'm trying to do).
I could just make the argument into a vector, and return the last element, but I'd like to avoid that if possible.
Use ...length and ...elt like this:
f <- function(...) ...elt(...length())
f(11, 12, 13)
## [1] 13
g <- function(...) {
dots <- list(...)
if (length(dots)) dots[[length(dots)]]
}
g(11,22)
# [1] 22
g(11,22,33,44,'abc')
# [1] "abc"
g() # returns nothing, NULL invisibly
I often choose to assign to dots or similar in the function and then deal with it, though that can be both good and bad. If you want to pass the args on to other functions, then you can still use ... as before (or you can use do.call(..), though that's not your question). One side-effect is that by doing this, the ... are evaluated, which though generally fine, in some corner-cases this evaluation may be too soon.
A demonstration:
g(stop("quux"), "abc")
# Error in g(stop("quux"), "abc") : quux
If you want to avoid early evaluation of other args, you can use match.call:
g <- function(...){
cl <- match.call(expand.dots = TRUE)
if (length(cl) > 1) cl[[length(cl)]]
}
g() # nothing
g(1)
# [1] 1
g(stop("quux"), "abc")
# [1] "abc"
As applied to the same R code or objects, quote and substitute typically return different objects. How can one make this difference apparent?
is.identical <- function(X){
out <- identical(quote(X), substitute(X))
out
}
> tmc <- function(X){
out <- list(typ = typeof(X), mod = mode(X), cls = class(X))
out
}
> df1 <- data.frame(a = 1, b = 2)
Here the printed output of quote and substitute are the same.
> quote(df1)
df1
> substitute(df1)
df1
And the structure of the two are the same.
> str(quote(df1))
symbol df1
> str(substitute(df1))
symbol df1
And the type, mode and class are all the same.
> tmc(quote(df1))
$typ
[1] "symbol"
$mod
[1] "name"
$cls
[1] "name"
> tmc(substitute(df1))
$typ
[1] "symbol"
$mod
[1] "name"
$cls
[1] "name"
And yet, the outputs are not the same.
> is.identical(df1)
[1] FALSE
Note that this question shows some inputs that cause the two functions to display different outputs. However, the outputs are different even when they appear the same, and are the same by most of the usual tests, as shown by the output of is.identical() above. What is this invisible difference, and how can I make it appear?
note on the tags: I am guessing that the Common LISP quote and the R quote are similar
The reason is that the behavior of substitute() is different based on where you call it, or more precisely, what you are calling it on.
Understanding what will happen requires a very careful parsing of the (subtle) documentation for substitute(), specifically:
Substitution takes place by examining each component of the parse tree
as follows: If it is not a bound symbol in env, it is unchanged. If it
is a promise object, i.e., a formal argument to a function or
explicitly created using delayedAssign(), the expression slot of the
promise replaces the symbol. If it is an ordinary variable, its value
is substituted, unless env is .GlobalEnv in which case the symbol is
left unchanged.
So there are essentially three options.
In this case:
> df1 <- data.frame(a = 1, b = 2)
> identical(quote(df1),substitute(df1))
[1] TRUE
df1 is an "ordinary variable", but it is called in .GlobalEnv, since env argument defaults to the current evaluation environment. Hence we're in the very last case where the symbol, df1, is left unchanged and so it identical to the result of quote(df1).
In the context of the function:
is.identical <- function(X){
out <- identical(quote(X), substitute(X))
out
}
The important distinction is that now we're calling these functions on X, not df1. For most R users, this is a silly, trivial distinction, but when playing with subtle tools like substitute it becomes important. X is a formal argument of a function, so that implies we're in a different case of the documented behavior.
Specifically, it says that now "the expression slot of the promise replaces the symbol". We can see what this means if we debug() the function and examine the objects in the context of the function environment:
> debugonce(is.identical)
> is.identical(X = df1)
debugging in: is.identical(X = df1)
debug at #1: {
out <- identical(quote(X), substitute(X))
out
}
Browse[2]>
debug at #2: out <- identical(quote(X), substitute(X))
Browse[2]> str(quote(X))
symbol X
Browse[2]> str(substitute(X))
symbol df1
Browse[2]> Q
Now we can see that what happened is precisely what the documentation said would happen (Ha! So obvious! ;) )
X is a formal argument, or a promise, which according to R is not the same thing as df1. For most people writing functions, they are effectively the same, but the internal implementation disagrees. X is a promise object, and substitute replaces the symbol X with the one that it "points to", namely df1. This is what the docs mean by the "expression slot of the promise"; that's what R sees when in the X = df1 part of the function call.
To round things out, try to guess what will happen in this case:
is.identical <- function(X){
out <- identical(quote(A), substitute(A))
out
}
is.identical(X = df1)
(Hint: now A is not a "bound symbol in the environment".)
A final example illustrating more directly the final case in the docs with the confusing exception:
#Ordinary variable, but in .GlobalEnv
> a <- 2
> substitute(a)
a
#Ordinary variable, but NOT in .GlobalEnv
> e <- new.env()
> e$a <- 2
> substitute(a,env = e)
[1] 2
For example, suppose I would like to be able to define a function that returned the name of the assignment variable concatenated with the first argument:
a <- add_str("b")
a
# "ab"
The function in the example above would look something like this:
add_str <- function(x) {
arg0 <- as.list(match.call())[[1]]
return(paste0(arg0, x))
}
but where the arg0 line of the function is replaced by a line that will get the name of the variable being assigned ("a") rather than the name of the function.
I've tried messing around with match.call and sys.call, but I can't get it to work. The idea here is that the assignment operator is being called on the variable and the function result, so that should be the parent call of the function call.
I think that it's not strictly possible, as other solutions explained, and the reasonable alternative is probably Yosi's answer.
However we can have fun with some ideas, starting simple and getting crazier gradually.
1 - define an infix operator that looks similar
`%<-add_str%` <- function(e1, e2) {
e2_ <- e2
e1_ <- as.character(substitute(e1))
eval.parent(substitute(e1 <- paste0(e1_,e2_)))
}
a %<-add_str% "b"
a
# "ab"
2 - Redefine := so that it makes available the name of the lhs to the rhs through a ..lhs() function
I think it's my favourite option :
`:=` <- function(lhs,rhs){
lhs_name <- as.character(substitute(lhs))
assign(lhs_name,eval(substitute(rhs)), envir = parent.frame())
lhs
}
..lhs <- function(){
eval.parent(quote(lhs_name),2)
}
add_str <- function(x){
res <- paste0(..lhs(),x)
res
}
a := add_str("b")
a
# [1] "ab"
There might be a way to redefine <- based on this, but I couldn't figure it out due to recursion issues.
3 - Use memory address dark magic to hunt lhs (if it exists)
This comes straight from: Get name of x when defining `(<-` operator
We'll need to change a bit the syntax and define the function fetch_name for this purpose, which is able to get the name of the rhs from a *<- function, where as.character(substitute(lhs)) would return "*tmp*".
fetch_name <- function(x,env = parent.frame(2)) {
all_addresses <- sapply(ls(env), pryr:::address2, env)
all_addresses <- all_addresses[names(all_addresses) != "*tmp*"]
all_addresses_short <- gsub("(^|<)[0x]*(.*?)(>|$)","\\2",all_addresses)
x_address <- tracemem(x)
untracemem(x)
x_address_short <- tolower(gsub("(^|<)[0x]*(.*?)(>|$)","\\2",x_address))
ind <- match(x_address_short, all_addresses_short)
x_name <- names(all_addresses)[ind]
x_name
}
`add_str<-` <- function(x,value){
x_name <- fetch_name(x)
paste0(x_name,value)
}
a <- NA
add_str(a) <- "b"
a
4- a variant of the latter, using .Last.value :
add_str <- function(value){
x_name <- fetch_name(.Last.value)
assign(x_name,paste0(x_name,value),envir = parent.frame())
paste0(x_name,value)
}
a <- NA;add_str("b")
a
# [1] "ab"
Operations don't need to be on the same line, but they need to follow each other.
5 - Again a variant, using a print method hack
Extremely dirty and convoluted, to please the tortured spirits and troll the others.
This is the only one that really gives the expected output, but it works only in interactive mode.
The trick is that instead of doing all the work in the first operation I also use the second (printing). So in the first step I return an object whose value is "b", but I also assigned a class "weird" to it and a printing method, the printing method then modifies the object's value, resets its class, and destroys itself.
add_str <- function(x){
class(x) <- "weird"
assign("print.weird", function(x) {
env <- parent.frame(2)
x_name <- fetch_name(x, env)
assign(x_name,paste0(x_name,unclass(x)),envir = env)
rm(print.weird,envir = env)
print(paste0(x_name,x))
},envir = parent.frame())
x
}
a <- add_str("b")
a
# [1] "ab"
(a <- add_str("b") will have the same effect as both lines above. print(a <- add_str("b")) would also have the same effect but would work in non interactive code, as well.
This is generally not possible because the operator <- is actually parsed to a call of the <- function:
rapply(as.list(quote(a <- add_str("b"))),
function(x) if (!is.symbol(x)) as.list(x) else x,
how = "list")
#[[1]]
#`<-`
#
#[[2]]
#a
#
#[[3]]
#[[3]][[1]]
#add_str
#
#[[3]][[2]]
#[1] "b"
Now, you can access earlier calls on the call stack by passing negative numbers to sys.call, e.g.,
foo <- function() {
inner <- sys.call()
outer <- sys.call(-1)
list(inner, outer)
}
print(foo())
#[[1]]
#foo()
#[[2]]
#print(foo())
However, help("sys.call") says this (emphasis mine):
Strictly, sys.parent and parent.frame refer to the context of the
parent interpreted function. So internal functions (which may or may
not set contexts and so may or may not appear on the call stack) may
not be counted, and S3 methods can also do surprising things.
<- is such an "internal function":
`<-`
#.Primitive("<-")
`<-`(x, foo())
x
#[[1]]
#foo()
#
#[[2]]
#NULL
As Roland pointed, the <- is outside of the scope of your function and could only be located looking at the stack of function calls, but this fail. So a possible solution could be to redefine the '<-' else than as a primitive or, better, to define something that does the same job and additional things too.
I don't know if the ideas behind following code can fit your needs, but you can define a "verbose assignation" :
`:=` <- function (var, value)
{
call = as.list(match.call())
message(sprintf("Assigning %s to %s.\n",deparse(call$value),deparse(call$var)))
eval(substitute(var <<- value))
return(invisible(value))
}
x := 1:10
# Assigning 1:10 to x.
x
# [1] 1 2 3 4 5 6 7 8 9 10
And it works in some other situation where the '<-' is not really an assignation :
y <- data.frame(c=1:3)
colnames(y) := "b"
# Assigning "b" to colnames(y).
y
# b
#1 1
#2 2
#3 3
z <- 1:4
dim(z) := c(2,2)
#Assigning c(2, 2) to dim(z).
z
# [,1] [,2]
#[1,] 1 3
#[2,] 2 4
>
I don't think the function has access to the variable it is being assigned to. It is outside of the function scope and you do not pass any pointer to it or specify it in any way. If you were to specify it as a parameter, you could do something like this:
add_str <- function(x, y) {
arg0 <-deparse(substitute(x))
return(paste0(arg0, y))
}
a <- 5
add_str(a, 'b')
#"ab"
Imagine you have a simple function that specifies which statistical tests to run for each variable. Its syntax, simplified for the purposes of this question is as follows:
test <- function(...) {
x <- list(...)
return(x)
}
which takes argument pairs such as Gender = 'Tukey', and intends to pass its result to other functions down the line. The output of test() is as follows:
test(Gender = 'Tukey')
# $Gender
# [1] "Tukey"
What is desired is the ability to replace the literal Gender by a dynamically assigned variable varname (e.g., for looping purposes). Currently what happens is:
varname <- 'Gender'
test(varname = 'Tukey')
# $varname
# [1] "Tukey"
but what is desired is this:
varname <- 'Gender'
test(varname = 'Tukey')
# $Gender
# [1] "Tukey"
I tried tinkering with functions such as eval() and parse(), but to no avail. In practice, I resolved the issue by simply renaming the resulting list, but it is an ugly solution and I am sure there is an elegant R way to achieve it. Thank in advance for the educational value of your answer.
NB: This question occurred to me while trying to program a custom function which uses mcp() from the effects package in its internals. The said mcp() function is the real world counterpart of test().
EDIT1: Perhaps it needs to be clarified that (for educational purposes) changing test() is not an option. The question is about how to pass the tricky argument to test(). If you take a look at NB, it becomes clear why: the real world counterpart of test(), namely mcp(), comes with a package. And while it is possible to create a modified copy of it, I am really curious whether there exists a simple solution in somehow 'converting' the dynamically assigned variable to a literal in the context of dot-arguments.
This works:
test <- function(...) {
x = list(...)
names(x) <- sapply(names(x),
function(p) eval(as.symbol(p)))
return(x)
}
apple = "orange"
test(apple = 5)
We can use
test <- function(...) {
x <- list(...)
if(exists(names(x))) names(x) <- get(names(x))
x
}
test(Gender = 'Tukey')
#$Gender
#[1] "Tukey"
test(varname = 'Tukey')
#$Gender
#[1] "Tukey"
What about this:
varname <- "Gender"
args <- list()
args[[varname]] <- "Tukey"
do.call(test, args)
I'm new to R and I'm currently trying to supply the enumeration-like argument to the R function (or the RC/R6 class method), I currently use character vector plus match.arg similar to the following:
EnumTest = function(enum = c("BLUE", "RED", "BLACK")) {
enumArg <-
switch(
match.arg(enum), "BLUE" = 0L, "RED" = 1L, "BLACK" = 2L
)
switch(enumArg,
# do something
)
}
Is there are better/more concise way to imitate enum-like behavior in R? E.g. one big problem that user has to know the set of possible values for the argument and manually type it as a string - without any suggestion or auto-completion...
If there is no other better way, one thing that could improve above approach - it'd be nice to make it more concise by say predefining enums globally or say as private members of R6 class:
Color <- c("BLUE", "RED", "BLACK")
Then one could (re)use it in one or more function definitions, e.g.:
EnumTest = function(enum = Color) {
...
However, I'm not sure how to use this Color vector in match.arg function. It'd be nice if I could define Color as a map with keys being actual color values and values being integer representation - but I'm not sure how sensible that is.. Anyways, maybe there are more common neat approaches exist.
The main goal would be to provide an easy-to-use intuitive interface to the user of my package and functions (e.g. easy way to find the set of possible values, tab-completion, auto-suggestion, etc..), followed by standardized development of such functions using enum-like arguments
How about using a function that defines the enum by returning list(a= "a", ...)? You can then either assign the returned vector to a variable and use it in context, or use the function directly. Either a name or an integer reference will work as an index, although you have to use the unlist version of the index lookup, [[, otherwise you get a list with one element.
colorEnum <- function() {
list(BLUE = "BLUE", RED = "RED", BLACK = "BLACK")
}
colorEnum()$BLUE
#> [1] "BLUE"
colorEnum()[[1]]
#> [1] "BLUE"
colorEnum()[1]
#> $BLUE
#> [1] "BLUE"
col <- colorEnum()
col$BLUE
#> [1] "BLUE"
col[[1]]
#> [1] "BLUE"
col$BAD_COLOR
#> NULL
col[[5]]
#> Error in col[[5]] : subscript out of bounds
You can get the list of names for use in a match, i.e. your function parameter could be
EnumTest = function( enum = names(colorEnum()) { ...
You can actually abbreviate too, but it must be unique. (If you use RStudio, since col is a list, it will suggest completions!)
col$BLA
#> [1] "BLACK"
col$BL
#> NULL
If you want more sophisticated enum handling, you could assign S3 classes to the thing returned by your enum constructor function and write a small collection of functions to dispatch on class "enum" and allow case-insensitive indexing. You could also add special functions to work with a specific class, e.g. "colorEnum"; I have not done that here. Inheritance means the list access methods all still work.
colorEnum2 <- function() {
structure(
list(BLUE = "BLUE", RED = "RED", BLACK = "BLACK"),
class= c("colorEnum2", "enum", "list")
)
}
# Note, changed example to allow multiple returned values.
`[.enum` <- function(x, i) {
if ( is.character( i ))
i <- toupper(i)
class(x) <- "list"
names(as.list(x)[i])
}
`[[.enum` <- function(x, i, exact= FALSE) {
if ( is.character( i ))
i <- toupper(i)
class(x) <- "list"
as.list(x)[[i, exact=exact]]
}
`$.enum` <- function(x, name) {
x[[name]]
}
col <- colorEnum2()
# All these return [1] "RED"
col$red
col$r
col[["red"]]
col[["r"]]
col["red"]
col[c("red", "BLUE")]
#> [1] "RED" "BLUE"
col["r"]
[1] NA # R does not matches partial strings with "["
These override the built in [, [[ and $ functions when the thing being indexed is of class "enum", for any "enum" classed objects. If you need another one, you just need to define it.
directionEnum <- function() {
structure(
list(LEFT = "LEFT", RIGHT = "RIGHT"),
class= c("directionEnum", "enum", "list")
)
}
directionEnum()$l
#> [1] "LEFT"
If you need several enum objects, you could add a factory function enum that takes a vector of strings and a name and returns an enum object. Most of this is just validation.
enum <- function(enums, name= NULL) {
if (length(enums) < 1)
stop ("Enums may not be empty." )
enums <- toupper(as.character(enums))
uniqueEnums <- unique(enums)
if ( ! identical( enums, uniqueEnums ))
stop ("Enums must be unique (ignoring case)." )
validNames <- make.names(enums)
if ( ! identical( enums, validNames ))
stop( "Enums must be valid R identifiers." )
enumClass <- c(name, "enum", "list")
obj <- as.list(enums)
names(obj) <- enums
structure( obj, class= enumClass)
}
col <- enum(c("BLUE", "red", "Black"), name = "TheColors")
col$R
#> [1] "RED"
class(col)
#> [1] "TheColors" "enum" "list"
side <- enum(c("left", "right"))
side$L
#> [1] "LEFT"
class(side)
#> [1] "enum" "list"
But now this is starting to look like a package...
I like to use environments as replacement for enums because you can lock them to prevent any changes after creation. I define my creation function like this:
Enum <- function(...) {
## EDIT: use solution provided in comments to capture the arguments
values <- sapply(match.call(expand.dots = TRUE)[-1L], deparse)
stopifnot(identical(unique(values), values))
res <- setNames(seq_along(values), values)
res <- as.environment(as.list(res))
lockEnvironment(res, bindings = TRUE)
res
}
Create a new enum like this:
FRUITS <- Enum(APPLE, BANANA, MELON)
We can the access the values:
FRUITS$APPLE
But we cannot modify them or create new ones:
FRUITS$APPLE <- 99 # gives error
FRUITS$NEW <- 88 # gives error
I just faced this exact problem and could only find this SO question. The objectProperties package mention by Paul seems abandoned (it produces several warnings) and has lots of overhead for such a simple (in principle) problem. I came up with the following lightweight solution (depends only on the stringi package), which reproduces the feel of Enums in C languages. Maybe this helps someone.
EnumTest <- function(colorEnum = ColorEnum$BLUE) {
enumArg <- as.character(match.call()[2])
match.arg(enumArg, stringi::stri_c("ColorEnum$", names(ColorEnum)))
sprintf("%s: %i",enumArg,colorEnum)
}
ColorEnum <- list(BLUE = 0L, RED = 1L, BLACK = 2L)
Here is a simple method which supports enums with assigned values or which use the name as the value by default:
makeEnum <- function(inputList) {
myEnum <- as.list(inputList)
enumNames <- names(myEnum)
if (is.null(enumNames)) {
names(myEnum) <- myEnum
} else if ("" %in% enumNames) {
stop("The inputList has some but not all names assigned. They must be all assigned or none assigned")
}
return(myEnum)
}
If you are simply trying to make a defined list of names and don't care about the values you can use like this:
colors <- makeEnum(c("red", "green", "blue"))
If you wish, you can specify the values:
hexColors <- makeEnum(c(red="#FF0000", green="#00FF00", blue="#0000FF"))
In either case it is easy to access the enum names because of code completion:
> hexColors$green
[1] "#00FF00"
To check if a variable is a value in your enum you can check like this:
> param <- hexColors$green
> param %in% hexColors
Update 07/21/2017: I have created a package for enumerations in R:
https://github.com/aryoda/R_enumerations
If you want to use self-defined enum-alike data types as arguments of R functions that support
automatic translation of enum item names to the corresponding integer values
code auto completion (e. g. in RStudio)
clear documentation in the function signature which values are allowed as actual function parameters
easy validation of the actual function parameter against the allowed (integer) enum item values
you can define your own match.enum.arg function, e. g.:
match.enum.arg <- function(arg, choices) {
if (missing(choices)) {
formal.args <- formals(sys.function(sys.parent()))
choices <- eval(formal.args[[as.character(substitute(arg))]])
}
if(identical(arg, choices))
arg <- choices[[1]][1] # choose the first value of the first list item
allowed.values <- sapply(choices,function(item) {item[1]}) # extract the integer values of the enum items
if(!is.element(arg, allowed.values))
stop(paste("'arg' must be one of the values in the 'choices' list:", paste(allowed.values, collapse = ", ")))
return(arg)
}
Usage:
You can then define and use your own enums like this:
ColorEnum <- list(BLUE = 1L, RED = 2L, BLACK = 3L)
color2code = function(enum = ColorEnum) {
i <- match.enum.arg(enum)
return(i)
}
Example calls:
> color2code(ColorEnum$RED) # use a value from the enum (with auto completion support)
[1] 2
> color2code() # takes the first color of the ColorEnum
[1] 1
> color2code(3) # an integer enum value (dirty, just for demonstration)
[1] 3
> color2code(4) # an invalid number
Error in match.enum.arg(enum) :
'arg' must be one of the values in the 'choices' list: 1, 2, 3