Enum-like arguments in R

I'm new to R and am trying to supply an enumeration-like argument to an R function (or an RC/R6 class method). I currently use a character vector plus match.arg, similar to the following:
EnumTest = function(enum = c("BLUE", "RED", "BLACK")) {
enumArg <-
switch(
match.arg(enum), "BLUE" = 0L, "RED" = 1L, "BLACK" = 2L
)
switch(enumArg,
# do something
)
}
Is there a better, more concise way to imitate enum-like behavior in R? One big problem is that the user has to know the set of possible values for the argument and type one manually as a string, without any suggestion or auto-completion.
If there is no better way, one thing that could improve the above approach would be to make it more concise by predefining enums globally, or as private members of an R6 class:
Color <- c("BLUE", "RED", "BLACK")
Then one could (re)use it in one or more function definitions, e.g.:
EnumTest = function(enum = Color) {
...
However, I'm not sure how to use this Color vector with match.arg. It would be nice if I could define Color as a map whose keys are the color names and whose values are their integer representations, but I'm not sure how sensible that is. Maybe there are more common, neater approaches.
The main goal is to provide an easy-to-use, intuitive interface to users of my package and functions (e.g. an easy way to find the set of possible values, tab completion, auto-suggestion), followed by standardized development of functions that take enum-like arguments.

How about using a function that defines the enum by returning list(a = "a", ...)? You can then either assign the returned list to a variable and use it in context, or call the function directly. Either a name or an integer position works as an index, although you have to use the element-extracting form of the index operator, [[; with [ you get a list with one element.
colorEnum <- function() {
list(BLUE = "BLUE", RED = "RED", BLACK = "BLACK")
}
colorEnum()$BLUE
#> [1] "BLUE"
colorEnum()[[1]]
#> [1] "BLUE"
colorEnum()[1]
#> $BLUE
#> [1] "BLUE"
col <- colorEnum()
col$BLUE
#> [1] "BLUE"
col[[1]]
#> [1] "BLUE"
col$BAD_COLOR
#> NULL
col[[5]]
#> Error in col[[5]] : subscript out of bounds
You can get the list of names for use in a match, i.e. your function parameter could be
EnumTest = function(enum = names(colorEnum())) { ...
You can actually abbreviate too, but it must be unique. (If you use RStudio, since col is a list, it will suggest completions!)
col$BLA
#> [1] "BLACK"
col$BL
#> NULL
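Putting these pieces together, a complete function might look like the following minimal sketch. The body of EnumTest is an assumption made for illustration (the 0-based integer codes mirror the question's example); it is not part of the original answer.

```r
# Enum constructor as defined above.
colorEnum <- function() {
  list(BLUE = "BLUE", RED = "RED", BLACK = "BLACK")
}

# Illustrative function: validates the argument with match.arg and
# returns a 0-based integer code (an assumption taken from the question).
EnumTest <- function(enum = names(colorEnum())) {
  enum <- match.arg(enum)
  match(enum, names(colorEnum())) - 1L
}

code_default <- EnumTest()                 # no argument: first choice, BLUE
code_red     <- EnumTest(colorEnum()$RED)  # completion-friendly call
code_partial <- EnumTest("BLA")            # unique partial match, BLACK
```

An ambiguous abbreviation such as EnumTest("BL") is rejected by match.arg with an error, which is usually what you want for an enum-like argument.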
If you want more sophisticated enum handling, you could assign S3 classes to the thing returned by your enum constructor function and write a small collection of functions to dispatch on class "enum" and allow case-insensitive indexing. You could also add special functions to work with a specific class, e.g. "colorEnum"; I have not done that here. Inheritance means the list access methods all still work.
colorEnum2 <- function() {
structure(
list(BLUE = "BLUE", RED = "RED", BLACK = "BLACK"),
class= c("colorEnum2", "enum", "list")
)
}
# Note, changed example to allow multiple returned values.
`[.enum` <- function(x, i) {
if ( is.character( i ))
i <- toupper(i)
class(x) <- "list"
names(as.list(x)[i])
}
`[[.enum` <- function(x, i, exact= FALSE) {
if ( is.character( i ))
i <- toupper(i)
class(x) <- "list"
as.list(x)[[i, exact=exact]]
}
`$.enum` <- function(x, name) {
x[[name]]
}
col <- colorEnum2()
# All these return [1] "RED"
col$red
col$r
col[["red"]]
col[["r"]]
col["red"]
col[c("red", "BLUE")]
#> [1] "RED" "BLUE"
col["r"]
#> [1] NA  # R does not match partial strings with "["
These methods override the built-in [, [[ and $ operators whenever the object being indexed has class "enum". If you need another enum, you only have to define its constructor.
directionEnum <- function() {
structure(
list(LEFT = "LEFT", RIGHT = "RIGHT"),
class= c("directionEnum", "enum", "list")
)
}
directionEnum()$l
#> [1] "LEFT"
If you need several enum objects, you could add a factory function enum that takes a vector of strings and a name and returns an enum object. Most of this is just validation.
enum <- function(enums, name= NULL) {
if (length(enums) < 1)
stop ("Enums may not be empty." )
enums <- toupper(as.character(enums))
uniqueEnums <- unique(enums)
if ( ! identical( enums, uniqueEnums ))
stop ("Enums must be unique (ignoring case)." )
validNames <- make.names(enums)
if ( ! identical( enums, validNames ))
stop( "Enums must be valid R identifiers." )
enumClass <- c(name, "enum", "list")
obj <- as.list(enums)
names(obj) <- enums
structure( obj, class= enumClass)
}
col <- enum(c("BLUE", "red", "Black"), name = "TheColors")
col$R
#> [1] "RED"
class(col)
#> [1] "TheColors" "enum" "list"
side <- enum(c("left", "right"))
side$L
#> [1] "LEFT"
class(side)
#> [1] "enum" "list"
But now this is starting to look like a package...

I like to use environments as replacement for enums because you can lock them to prevent any changes after creation. I define my creation function like this:
Enum <- function(...) {
## EDIT: use solution provided in comments to capture the arguments
values <- sapply(match.call(expand.dots = TRUE)[-1L], deparse)
stopifnot(identical(unique(values), values))
res <- setNames(seq_along(values), values)
res <- as.environment(as.list(res))
lockEnvironment(res, bindings = TRUE)
res
}
Create a new enum like this:
FRUITS <- Enum(APPLE, BANANA, MELON)
We can then access the values:
FRUITS$APPLE
But we cannot modify them or create new ones:
FRUITS$APPLE <- 99 # gives error
FRUITS$NEW <- 88 # gives error
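As a rough sketch of how such a locked environment can be inspected and used for validation: ls() and mget() list the names and values, and a tryCatch shows that modification fails. The helper is_fruit is a hypothetical name introduced here for illustration.

```r
# Enum constructor as defined above.
Enum <- function(...) {
  values <- sapply(match.call(expand.dots = TRUE)[-1L], deparse)
  stopifnot(identical(unique(values), values))
  res <- setNames(seq_along(values), values)
  res <- as.environment(as.list(res))
  lockEnvironment(res, bindings = TRUE)
  res
}

FRUITS <- Enum(APPLE, BANANA, MELON)

# List the allowed names and their integer values.
fruit_names  <- sort(ls(FRUITS))
fruit_values <- unlist(mget(fruit_names, envir = FRUITS))

# Hypothetical helper: check that an argument is one of the enum values.
is_fruit <- function(x) x %in% fruit_values

# Attempts to modify the locked environment raise an error.
outcome <- tryCatch(
  {
    assign("APPLE", 99L, envir = FRUITS)
    "modified"
  },
  error = function(e) "locked"
)
```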

I just faced this exact problem and could only find this SO question. The objectProperties package mentioned by Paul seems abandoned (it produces several warnings) and has a lot of overhead for such a simple (in principle) problem. I came up with the following lightweight solution (it depends only on the stringi package), which reproduces the feel of enums in C-like languages. Maybe this helps someone.
EnumTest <- function(colorEnum = ColorEnum$BLUE) {
enumArg <- as.character(match.call()[2])
match.arg(enumArg, stringi::stri_c("ColorEnum$", names(ColorEnum)))
sprintf("%s: %i",enumArg,colorEnum)
}
ColorEnum <- list(BLUE = 0L, RED = 1L, BLACK = 2L)

Here is a simple method which supports enums with assigned values or which use the name as the value by default:
makeEnum <- function(inputList) {
myEnum <- as.list(inputList)
enumNames <- names(myEnum)
if (is.null(enumNames)) {
names(myEnum) <- myEnum
} else if ("" %in% enumNames) {
stop("The inputList has some but not all names assigned. They must be all assigned or none assigned")
}
return(myEnum)
}
If you are simply trying to make a defined list of names and don't care about the values you can use like this:
colors <- makeEnum(c("red", "green", "blue"))
If you wish, you can specify the values:
hexColors <- makeEnum(c(red="#FF0000", green="#00FF00", blue="#0000FF"))
In either case it is easy to access the enum names because of code completion:
> hexColors$green
[1] "#00FF00"
To check if a variable is a value in your enum you can check like this:
> param <- hexColors$green
> param %in% hexColors
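The membership check can be wrapped into a function that takes an enum value as its argument. The following is a minimal sketch; colourName is a hypothetical function invented for this example.

```r
# makeEnum as defined above.
makeEnum <- function(inputList) {
  myEnum <- as.list(inputList)
  enumNames <- names(myEnum)
  if (is.null(enumNames)) {
    names(myEnum) <- myEnum
  } else if ("" %in% enumNames) {
    stop("The inputList has some but not all names assigned. They must be all assigned or none assigned")
  }
  return(myEnum)
}

hexColors <- makeEnum(c(red = "#FF0000", green = "#00FF00", blue = "#0000FF"))

# Hypothetical function: accepts only values from the enum and
# returns the name of the matching enum item.
colourName <- function(colour = hexColors$red) {
  allowed <- unlist(hexColors)
  if (!colour %in% allowed) {
    stop("'colour' must be one of: ", paste(names(allowed), collapse = ", "))
  }
  names(allowed)[match(colour, allowed)]
}

default_name <- colourName()                # uses hexColors$red
blue_name    <- colourName(hexColors$blue)  # completion-friendly call
```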

Update 07/21/2017: I have created a package for enumerations in R:
https://github.com/aryoda/R_enumerations
If you want to use self-defined enum-alike data types as arguments of R functions that support
automatic translation of enum item names to the corresponding integer values
code auto completion (e. g. in RStudio)
clear documentation in the function signature which values are allowed as actual function parameters
easy validation of the actual function parameter against the allowed (integer) enum item values
you can define your own match.enum.arg function, e. g.:
match.enum.arg <- function(arg, choices) {
if (missing(choices)) {
formal.args <- formals(sys.function(sys.parent()))
choices <- eval(formal.args[[as.character(substitute(arg))]])
}
if(identical(arg, choices))
arg <- choices[[1]][1] # choose the first value of the first list item
allowed.values <- sapply(choices,function(item) {item[1]}) # extract the integer values of the enum items
if(!is.element(arg, allowed.values))
stop(paste("'arg' must be one of the values in the 'choices' list:", paste(allowed.values, collapse = ", ")))
return(arg)
}
Usage:
You can then define and use your own enums like this:
ColorEnum <- list(BLUE = 1L, RED = 2L, BLACK = 3L)
color2code = function(enum = ColorEnum) {
i <- match.enum.arg(enum)
return(i)
}
Example calls:
> color2code(ColorEnum$RED) # use a value from the enum (with auto completion support)
[1] 2
> color2code() # takes the first color of the ColorEnum
[1] 1
> color2code(3) # an integer enum value (dirty, just for demonstration)
[1] 3
> color2code(4) # an invalid number
Error in match.enum.arg(enum) :
'arg' must be one of the values in the 'choices' list: 1, 2, 3

Related

Putting initial values into an R6 Singleton

I would like to store initial values in a Singleton in R, so I am using the R6 Singleton class. The examples show how to manipulate internal variables, but I cannot find a way to pass in values. I tried
library( R6 )
library( R6P )
Parameters <- R6::R6Class("Parameters", inherit = R6P::Singleton, public = list(
elements = NA,
initialize = function(...) { tmp<-list(...);if (length(tmp)>0) {self$elements<-tmp} },
retrieve = function(){ self$elements }
))
My goal is that if I create one instance with initial values, then subsequent instances will find the same value. When I run the above, however, I get:
> p <- Parameters$new( "a" )
> p$retrieve()
[[1]]
[1] "a"
> q <- Parameters$new()
> q$retrieve()
[1] NA
How can I get the second instance of the Singleton, q, to return the initialized value "a"?
------ Edited using @eduardokapp's suggestion -------
(First, I don't know enough about SO rules to know whether to post an edit, post my own answer or ask a new question so I chose to edit the initial question. Apologies if that is wrong.)
Looking at the link to Hadley Wickham's "Advanced R" chapter, this edit gives me the same object in the "elements" variable like a Singleton:
library( pryr )
library( R6 )
library( R6P )
TemporaryFile <- R6Class("TemporaryFile", list(
path = NULL,
initialize = function(...) {
self$path <- list(...)
},
finalize = function() {
message("Cleaning up ", self$path)
unlink(self$path)
}
))
Parameters <- R6::R6Class("Parameters", inherit = R6P::Singleton, public = list(
name = NULL,
elements = TemporaryFile$new(),
initialize = function(name) { self$name = name },
retrieve = function(){ address(elements) }
))
which when I run gives:
> p <- Parameters$new( "a" )
> q <- Parameters$new( "b" )
> p$name
[1] "a"
> q$name
[1] "b"
> p$retrieve() == q$retrieve()
[1] TRUE
>
The problem now is how to assign an input value in "TemporaryFile$new()". Anything I try to put in ("path","...") gives an error. How do you assign a value to the internal class so that you can store a value on the first instantiation that will be shared by all subsequent instantiations? Am I just making this more complicated than it needs to be?
Two tricks. The first was to set the preserved value after construction of the instance. The second was to use an 'ifelse' function when assigning the preserved 'value'. Otherwise, it is written over at construction.
library( R6 )
library( R6P )
Parameters <- R6::R6Class("Parameters", inherit = R6P::Singleton, public = list(
input = NA,
value = ifelse( exists("self$value"), value, NA ),
assign = function(input) {self$value = input}
))
Test
> library(pryr)
> p <- Parameters$new()
> p$assign(input="Hi")
> p$value
[1] "Hi"
> q <- Parameters$new()
> q$value
[1] "Hi"
> p$value
[1] "Hi"
> identical(address(p),address(q))
[1] TRUE
Every new instance of Parameters will hold the same 'value'.
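For comparison, the same "first initialisation wins" behaviour can be sketched in base R with a closure, without R6 or R6P. This is an alternative illustration rather than the approach above; get_parameters and its elements field are hypothetical names.

```r
# Hypothetical accessor: the first call creates the shared object and
# stores its arguments; later calls return that same object unchanged.
get_parameters <- local({
  instance <- NULL
  function(...) {
    if (is.null(instance)) {
      inst <- new.env()
      inst$elements <- list(...)
      instance <<- inst
    }
    instance
  }
})

p <- get_parameters("a")
q <- get_parameters()      # arguments are ignored after the first call

same_object  <- identical(p, q)
shared_value <- q$elements[[1]]
```

Because environments have reference semantics, p and q are the same object, so a value stored through one is visible through the other.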

Replacement function as R6 class member function

I have been playing around with R6 a bit and tried to implement a replacement function (similar in spirit to base::`diag<-`()). I wasn't hugely surprised to learn that the following does not work:
library(R6)
r6_class <- R6Class("r6_class",
public = list(
initialize = function(x) private$data <- x,
elem = function(i) private$data[i],
`elem<-` = function(i, val) private$data[i] <- val
),
private = list(
data = NULL
)
)
test <- r6_class$new(1:5)
test$elem(2)
#> [1] 2
test$elem(2) <- 3
#> Error in test$elem(2) <- 3 :
#> target of assignment expands to non-language object
What does this correspond to in prefix notation? All of the following work as expected, so I guess it's none of these
test$`elem<-`(2, 3)
`$`(test, "elem<-")(2, 3)
I'm less interested in possible workarounds, but more in understanding why the above is invalid.
You are allowed to have nested complex assignments, e.g.
names(x)[3] <- "c"
but
test$elem(2) <- 3
is not of that form. It would be legal syntax as
elem(test,2) <- 3
which would expand to
`*tmp*` <- test
test <- `elem<-`(`*tmp*`, 2, value = 3)
but in the original form it would have to expand to
`*tmp*` <- 2
2 <- `test$elem<-`(`*tmp*`, value = 3)
(I've used test$elem<- in backticks to suggest it's the assignment version of the function returned by test$elem. That's not really right, there is no such thing.) The main problem is that the object being modified is 2, so you get the error message you saw: you're not allowed to modify 2.
If you want to do this in R6, I think you could do it something like this. Define a global function
`elem<-` <- function(x, arg, value) x$`elem<-`(arg, value)
and change the definition of your class elem<- method to
`elem<-` = function(i, val) { private$data[i] <- val; self }
Not all that convenient to need two definitions for every assignment method, but it appears to work.
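The expansion rule described above can be checked with an ordinary vector in base R. This is a small sketch; the replacement function here works on plain vectors rather than on the R6 object from the question.

```r
# A replacement function: its name ends in `<-` and its last argument
# must be called `value`.
`elem<-` <- function(x, i, value) {
  x[i] <- value
  x
}

v <- 1:5
elem(v, 2) <- 10L                  # sugar for the explicit call below

w <- 1:5
w <- `elem<-`(w, 2, value = 10L)   # what the assignment expands to
```

Both forms leave the same result in v and w, which is why the first argument of the left-hand-side call has to be a variable that R can reassign.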

Error in using match.arg for multiple arguments

I am new to using match.arg for default value specification in R functions. And I have a query regarding the below behavior.
trial_func <- function(a=c("1","9","20"),b=c("12","3"),d=c("55","01")){
a <- match.arg(a)
b <- match.arg(b)
d <- match.arg(d)
list(a,b,d)
}
trial_func()
# [[1]]
# [1] "1"
#
# [[2]]
# [1] "12"
#
# [[3]]
# [1] "55"
When I try using match.arg for each individual argument, it works as expected. But when I try to use an lapply to reduce the code written, it causes the below issue.
trial_func_apply <- function(a=c("1","9","20"),b=c("12","3"),d=c("55","01")){
lapply(list(a,b,d), match.arg)
}
trial_func_apply()
Error in FUN(X[[i]], ...) : 'arg' must be of length 1
Am I missing something here?
It's an old question, but I feel it's a great one, so I will try to provide an extensive explanation by covering the following:
Read the relevant documentation for ?match.arg
Make match.arg fail to guess the choices
Learn three features of the R language that match.arg uses underneath.
Simplified match.arg implementation
Make the lapply example of the question work
match.arg documentation
The usage tells you that match.arg needs the selected option you want to match (arg) and all the possible choices:
match.arg(arg, choices, several.ok = FALSE)
If we read choices, we see that it can often be missing, and we should read more in the details... How could match.arg work without having the possible choices, we wonder?
choices: a character vector of candidate values, often missing, see
‘Details’.
Maybe the Details section gives some hints (bold is mine):
Details:
In the one-argument form ‘match.arg(arg)’, the choices are
obtained from a default setting for the formal argument ‘arg’ of
the function from which ‘match.arg’ was called. (Since default
argument matching will set ‘arg’ to ‘choices’, this is allowed as
an exception to the ‘length one unless ‘several.ok’ is ‘TRUE’’
rule, and returns the first element.)
So, if you don't specify the choices argument, R will make a bit of effort to guess it right automagically. For the R magic to work, several conditions must be fulfilled:
The match.arg function must be called directly from the function with the argument
The name of the variable to be matched must be the name of the argument.
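When either condition cannot be met, the two-argument form avoids the guessing entirely. A minimal sketch, where pick_colour is a hypothetical function used only for illustration:

```r
# Hypothetical function: the matched variable is NOT named after the
# formal argument, so the choices are passed to match.arg explicitly.
pick_colour <- function(colour = c("red", "green", "blue")) {
  choices <- c("red", "green", "blue")
  chosen <- colour
  match.arg(chosen, choices)
}

first_choice  <- pick_colour()      # default: first element of choices
partial_match <- pick_colour("gr")  # unique partial match
```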
match.arg() can be tricked:
Let's make match.arg() fail to guess the choices:
dummy_fun1 <- function(x = c("a", "b"), y = "c") {
# If you name your argument like another argument
y <- x
# The guessed choices will correspond to y (how could it know they were x?)
wrong_choices <- match.arg(y)
}
dummy_fun1(x = "a")
# Error in match.arg(y) : 'arg' should be “c”
dummy_fun2 <- function(x = c("a", "b"), y = "c") {
# If you name your argument differently
z <- x
# You don't get any guess:
wrong_choices <- match.arg(z)
}
dummy_fun2(x="a")
#Error in match.arg(z) : 'arg' should be one of
Three R language features that match.arg needs and uses
(1) It uses non-standard evaluation to get the name of the variable:
whats_the_var_name_called <- function(arg) {
as.character(substitute(arg))
}
x <- 3
whats_the_var_name_called(x)
# "x"
y <- x
whats_the_var_name_called(y)
# "y"
(2) It uses sys.function() to get the caller function:
this_function_returns_its_caller <- function() {
sys.function(sys.parent())  # function of the calling frame; sys.function(1) is only the outermost call
}
this_function_returns_itself <- function() {
me <- this_function_returns_its_caller()
message("This is the body of this_function_returns_itself")
me
}
> this_function_returns_itself()
This is the body of this_function_returns_itself
function() {
me <- this_function_returns_its_caller()
message("This is the body of this_function_returns_itself")
me
}
(3) It uses formals() to get the possible values:
a_function_with_default_values <- function(x=c("a", "b"), y = 3) {
}
formals(a_function_with_default_values)[["x"]]
#c("a", "b")
How does match.arg work?
Combining these things, match.arg uses substitute() to get the name of the args variable, it uses sys.function() to get the caller function, and it uses formals() on the caller function with the argument name to get the default values of the function (the choices):
get_choices <- function(arg, choices) {
if (missing(choices)) {
arg_name <- as.character(substitute(arg))
caller_fun <- sys.function(sys.parent())  # the function that called us
choices_as_call <- formals(caller_fun)[[arg_name]]
choices <- eval(choices_as_call)
}
choices
}
dummy_fun3 <- function(x = c("a", "b"), y = "c") {
get_choices(x)
}
dummy_fun3()
#[1] "a" "b"
Now that we know the magic used to get the choices, we can create our own match.arg implementation:
my_match_arg <- function(arg, choices) {
if (missing(choices)) {
arg_name <- as.character(substitute(arg))
caller_fun <- sys.function(sys.parent())  # the function that called us
choices_as_call <- formals(caller_fun)[[arg_name]]
choices <- eval(choices_as_call)
}
# Really simple and cutting corners... but you get the idea:
arg <- arg[1]
if (! arg %in% choices) {
stop("Wrong choice")
}
arg
}
dummy_fun4 <- function(x = c("a", "b"), y = "c") {
my_match_arg(x)
}
dummy_fun4(x="d")
# Error in my_match_arg(x) : Wrong choice
dummy_fun4(x="a")
# [1] "a"
And that's how match.arg works.
Why it does not work under lapply? How to fix it?
To guess the choices argument, match.arg looks at the function that called it. When we use match.arg() inside an lapply call, the caller is not our function, so match.arg fails to guess the choices. We can retrieve the choices ourselves and pass them explicitly:
trial_func_apply <- function(a=c("1","9","20"),b=c("12","3"),d=c("55","01")){
this_func <- sys.function()
the_args <- formals(this_func)
default_choices <- list(
eval(the_args[["a"]]),
eval(the_args[["b"]]),
eval(the_args[["d"]])
)
# mapply instead of lapply because we have two lists we
# want to apply match.arg to
mapply(match.arg, list(a,b,d), default_choices)
}
trial_func_apply()
# [1] "1" "12" "55"
Please note that I am cutting corners by not defining the environments where all the evals should happen, because in the examples above they work as-is. There may be corner cases that make these examples fail, so don't use them in production.
After investigating a bit: you can pass NULL as arg and your character vector as choices, in which case match.arg returns the first choice. Note that this only picks the first element of each vector and does not validate a value supplied by the user, i.e.
trial_func_apply <- function(a=c("1","9","20"),b=c("12","3"),d=c("55","01")){
lapply(list(a,b,d), function(i)match.arg(NULL, i))
}
trial_func_apply()
#[[1]]
#[1] "1"
#[[2]]
#[1] "12"
#[[3]]
#[1] "55"

Is it possible to map one string to another using a dictionary-like object in R?

I want to map one string to another in R, using a dictionary-like object as seen in Python. For example, in Python, you can define a dictionary to convert one string to another, like:
d = {"s": "Superlative", "d": "Dynamic", "f": "Furious"}
pd.apply(lambda x: d[x["map_column"]], axis=1)
However, in R, if you want to convert a set of strings in one column to another one based on such mapping, you would end up defining a function that takes a lot of if else, like:
mapper <- function(x) {
if (x == "s") {
return ("Superlative")
} else if (x == "d") {
return ("Dynamic")
}...
return("")
}
But I don't like defining such a long function. Is it possible to define a dictionary, or more specifically, to get the result with just one or two lines of code in R?
If you're already using the tidyverse, then this is what recode is for in dplyr.
df %>%
mutate(LongName = recode(ShortName,
s = "Superlative",
d = "Dynamic",
f = "Furious",
`multiple words` = "Use backticks to escape"
)
) ->
df
You can use an environment as a hash table:
dict <- new.env(hash = TRUE, parent = emptyenv(), size = NA)
## insert into hash table
dict[["s"]] <- "Superlative"
dict[["d"]] <- "Dynamic"
dict[["f"]] <- "Furious"
## query using key
key <- "s"
dict[[key]]
##[1] "Superlative"
key2 <- "f"
dict[[key2]]
##[1] "Furious"
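To look up several keys at once, for example a whole column, mget() can be used on the environment. A short sketch; the keys vector is made-up example data, and "x" is deliberately not in the table.

```r
dict <- new.env(hash = TRUE, parent = emptyenv())
dict[["s"]] <- "Superlative"
dict[["d"]] <- "Dynamic"
dict[["f"]] <- "Furious"

# Made-up keys; "x" has no entry and falls back to NA.
keys <- c("s", "f", "x", "d")
long_names <- unlist(
  mget(keys, envir = dict, ifnotfound = list(NA_character_))
)
```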
I found that this is easily achieved by defining a named character vector and indexing it with square brackets, like:
map <- c("s"="Superlative", "d"="Dynamic", "f"="Furious")
df$longCharacter <- map[as.character(df$shortCharacter)]
This works much like a dictionary in Python, in one line of code.
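A self-contained sketch of the named-vector lookup follows. The data frame and its column names are made up for the example; unname() drops the keys that would otherwise be carried along as names.

```r
# Named character vector acting as the dictionary.
lookup <- c(s = "Superlative", d = "Dynamic", f = "Furious")

# Made-up example data.
df <- data.frame(
  shortCharacter = c("s", "f", "d", "s"),
  stringsAsFactors = FALSE
)

# Index by name; as.character() guards against factor columns,
# which would otherwise be indexed by their integer codes.
df$longCharacter <- unname(lookup[as.character(df$shortCharacter)])
```

Keys that are missing from the vector produce NA rather than an error, so it can be worth checking for NA afterwards.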

Call more than one slot or field in S4 or Reference Classes

Is it possible to get or set values for more than one slot at once?
A<-setClass(Class="A",slot=c(name="character",type="character"))
a<-A()
slot(object,c("name","type"),check=T)
Do I have to write my own getSlot and setSlot methods? And how would I do that with R5 (Reference Classes)?
AB <- setRefClass("AB", fields=c(name="character"),
methods=list(getName=AB.getName)
)
AB.getName<-function(object){
object$name
}
a<-AB(name="abc")
AB.getName(a)
This answer applies to reference classes.
Let's start with the simplest definition of AB, without any methods.
AB <- setRefClass(
"AB",
fields = list(
name = "character"
)
)
You can retrieve the value of the name field in the same way you would a list.
ab <- AB$new(name = "ABC")
ab$name
## [1] "ABC"
(ab$name <- "ABCD")
## [1] "ABCD"
It is possible to autogenerate accessor methods to get and set the name field.
AB$accessors("name")
ab$getName()
ab$setName("ABCDE")
This is rather pointless, though, since it has exactly the same behaviour as before but with more typing. What can be useful is input checking (or other custom behaviour) when you set a field. To do this, you can add a setName method that you write yourself.
AB$methods(
setName = function(x)
{
if(length(x) > 1)
{
warning("Only using the first string.")
x <- x[1]
}
name <<- x
}
)
ab$setName(letters)
## Warning message:
## In ab$setName(letters) : Only using the first string.
It is also possible (and usually more useful) to define this method when you assign the reference class template.
AB <- setRefClass(
"AB",
fields = list(
name = "character"
),
methods = list(
setName = function(x)
{
if(length(x) > 1)
{
warning("Only using the first string.")
x <- x[1]
}
name <<- x
}
)
)
Response to comment:
Yes that works, but:
getFieldNames is more maintainable if implemented as names(AB$fields()).
When defining fields in setRefClass, use a list. For example, list(name="character", var2="character").
When assigning an instance of a reference class, use new. For example, AB$new(name="abc",var2="abc")
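With those points in place, reading or writing several fields at once can be sketched with the $field() method that reference class objects provide. The class Pair and its fields are hypothetical names chosen for the example.

```r
library(methods)

# Hypothetical class with two character fields.
Pair <- setRefClass(
  "Pair",
  fields = list(name = "character", var2 = "character")
)
obj <- Pair$new(name = "abc", var2 = "xyz")

# Read all fields into a named list.
field_names <- names(Pair$fields())
values <- lapply(
  setNames(field_names, field_names),
  function(f) obj$field(f)
)

# Write several fields from a named list.
new_values <- list(name = "cde", var2 = "fgh")
for (f in names(new_values)) {
  obj$field(f, new_values[[f]])
}
```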
In S4, the default initialize method allows one to write
A <- setClass(Class="A", slot=c(name="character",type="character"))
a <- A(name="abc", type="def")
initialize(a, name="cde", type="fgh")
Your own initialize methods (if any -- I think it's usually best to avoid them) have to be written to allow for this use. There is no default way to convert an S4 representation to a list.
You could incorporate these ideas into your own generics / methods with something like
setGeneric("values", function(x, ...) standardGeneric("values"))
setMethod("values", "A", function(x, ...) {
slts = slotNames(x)
lapply(setNames(slts, slts), slot, object=x)
})
setGeneric("values<-", function(x, ..., value) standardGeneric("values<-"))
setReplaceMethod("values", c(x="A", value="list"), function(x, ..., value) {
do.call("initialize", c(x, value))
})
with
> a <- A(name="abc", type="def")
> values(a) = list(name="cde", type="fgh")
> values(a)
$name
[1] "cde"
$type
[1] "fgh"
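If generics feel like too much machinery, plain helper functions built on slot() are another option. This is a minimal sketch under that assumption; the class Person and the helpers get_slot_values and set_slot_values are hypothetical names, not part of the methods package.

```r
library(methods)

# Hypothetical class with two character slots.
setClass("Person", slots = c(name = "character", type = "character"))

# Read several slots into a named list.
get_slot_values <- function(object, slot_names) {
  lapply(setNames(slot_names, slot_names), function(s) slot(object, s))
}

# Assign several slots from a named list, then re-validate the object.
set_slot_values <- function(object, values) {
  for (s in names(values)) {
    slot(object, s) <- values[[s]]
  }
  validObject(object)
  object
}

p <- new("Person", name = "abc", type = "def")
before <- get_slot_values(p, c("name", "type"))
p <- set_slot_values(p, list(name = "cde", type = "fgh"))
```

Since S4 objects have copy semantics, set_slot_values returns a modified copy that has to be assigned back, as in the last line.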

Resources