I am trying to write a function (within a package) that lets the user specify a function from a specific package to execute, along with additional arguments for that function.
I can do this in two steps with rlang::call2 as follows:
# `my_pkg` = "foo" (user-input)
# `my_fun` = "bar" (user-input)
# `args` is a named list (user-input)
the_call <- rlang::call2(.fn = my_fun, !!!args, .ns = my_pkg)
base::eval(the_call)
I have a feeling that the "correct" way to do this is to just use rlang::exec, but rlang::exec does not have a .ns argument like rlang::call2 does. Instead, it has an .env argument.
How do you specify the package (or namespace) of a function to rlang::exec? The .env argument to rlang::exec doesn't appear to be the answer, because the user should be able to specify a function in a specific package (that is installed) without first loading that package.
Or is rlang::exec not meant for this purpose?
Try
real_fun <- get(my_fun, envir = as.environment(paste0("package:", my_pkg)))
rlang::exec(real_fun, !!!args)
Since "packagename::functionname" (string) cannot be passed to rlang::exec, we can pass the real function. This can be retrieved by using get and passing a particular environment (from a package namespace, as above).
(This doesn't alleviate the need for two expressions to effect the exec, but it does allow one to use exec directly.)
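If my_pkg is installed but not attached with library() (as the question requires), a small variation is to look the function up in the package's namespace instead of the "package:" entry on the search path, e.g. via getExportedValue() or asNamespace():
real_fun <- getExportedValue(my_pkg, my_fun)
# or: real_fun <- get(my_fun, envir = asNamespace(my_pkg))
rlang::exec(real_fun, !!!args)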
Related
I'd like to create shorter aliases for some R functions, like j=paste0. However, when I add the line j=paste0 to ~/.Rprofile, it is later overridden when I use j as a variable name.
I was able to create my own package by first running package.skeleton() and then running this:
rm anRpackage/R/*
echo 'j=paste0'>anRpackage/R/test.R
echo 'export("j")'>anRpackage/NAMESPACE
rm -r anRpackage/man
R CMD INSTALL anRpackage
And then library(anRpackage); j=0; j("a",1) returns "a1". However, is there some easier way to prevent the function definition from being overridden by the variable assignment?
The code in .Rprofile runs in the global environment, so that's where the variable j will be defined. If you later use j as a variable name in the global environment, it will replace that value, because names must be unique within a given environment. However, two different environments can each define the same name, and when you call that name as a function, R uses the first binding it finds on the search path that actually is a function. So you basically need to create a separate environment. You can do that with a package, as you've done, or you can use attach() to add a new environment to your search path.
attach(list(j=paste0))
This will allow for the behavior you want.
attach(list(j=paste0))
j("a",1)
# [1] "a1"
j <- 5
j("a",1)
# [1] "a1"
Normally I would discourage people from using attach() because it makes confusing changes to your search path, but it does what you want here. Just be careful if anyone else will be using your code, because creating aliases for built-in functions and then reusing those names as variables is not a common thing to see in R scripts.
You can see where the environment was attached by looking at the output of search(). Normally it will be in the second position, so you can remove it with detach(2).
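If you give the attached environment an explicit name (the name "aliases" here is just an example), it is easier to spot in search() and to detach again:
attach(list(j = paste0), name = "aliases")
search()
# [1] ".GlobalEnv"  "aliases"  "package:stats"  ... (truncated)
detach("aliases")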
I ended up putting my custom functions in ~/.R.R and added these lines to .Rprofile:
if("myfuns"%in%search())detach("myfuns")
source("~/.R.R",attach(NULL,name="myfuns"))
From the help page of attach:
One useful ‘trick’ is to use ‘what = NULL’ (or equivalently a
length-zero list) to create a new environment on the search path
into which objects can be assigned by assign or load or
sys.source.
...
## create an environment on the search path and populate it
sys.source("myfuns.R", envir = attach(NULL, name = "myfuns"))
In a package, I would like to provide an S3 method compact for objects of class foobar.
There would therefore be a compact.foobar function in my package, along with the compact function itself:
compact = function(x, ...) {
  UseMethod("compact", x)
}
However, the latter would conflict with purrr::compact.
I could default the method to use purrr (compact.default = purrr::compact, or maybe compact.list = purrr::compact), but that would make little sense if the user does not have purrr loaded.
How can I make my method default to whichever version of compact is available in the user's environment, so that it uses purrr::compact, any other compact function that is defined, or fails if no such function exists?
Unfortunately S3 does not deal with this situation well. You have to search for suitable functions manually. The following works, more or less:
get_defined_function = function (name) {
  matches = getAnywhere(name)
  # Filter out invisible objects and duplicates.
  objs = matches$objs[matches$visible & ! matches$dups]
  # Filter out non-function objects.
  funs = objs[vapply(objs, is.function, logical(1L))]
  # Filter out functions defined in our own package.
  envs = lapply(funs, environment)
  funs = funs[! vapply(envs, identical, logical(1L), topenv())]
  funs[1L][[1L]] # Returns `NULL` if no function exists.
}
compact.default = function (...) {
  # Maybe add error handling for functions not found.
  get_defined_function('compact')(...)
}
This uses getAnywhere to find all objects named compact that R knows about. It then filters out those that are not visible (because they’re not inside attached packages) and those that are duplicates (this is probably redundant, but we do it anyway).
Next, it filters out anything that’s not a function. And finally it filters out the compact S3 generic that our own package defines. To do this, it compares each function’s environment to the package environment (given by topenv()).
This should work, but it has no rule for which function to prefer if several functions with the same name are defined in different locations (it just picks whichever getAnywhere lists first), and it also doesn’t check whether the function signature matches (doing this is hard in R, since function calling and parameter matching are very flexible).
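A rough way to check the lookup interactively (assuming purrr is attached and is the only other package on the search path providing a compact):
library(purrr)
f <- get_defined_function("compact")
identical(f, purrr::compact)
# [1] TRUE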
In my code, I needed to check which package a function comes from (in my case it was exprs(): I needed the one from Biobase, but it turned out to be masked by rlang).
From this SO question, I thought I could simply use environmentName(environment(functionname)). But for exprs from Biobase that expression returned an empty string:
environmentName(environment(exprs))
# [1] ""
After checking the structure of environment(exprs), I noticed that it has a .Generic member whose "package" attribute holds the package name:
environment(exprs)$.Generic
# [1] "exprs"
# attr(,"package")
# [1] "Biobase"
So, for now I made this helper function:
pkgparent <- function(functionObj) {
  functionEnv <- environment(functionObj)
  envName <- environmentName(functionEnv)
  if (envName != "") {
    return(envName)
  } else {
    return(attr(functionEnv$.Generic, "package"))
  }
}
It does the job and correctly returns the package name for the function, provided the package is loaded, for example:
pkgparent(exprs)
# Error in environment(functionObj) : object 'exprs' not found
library(Biobase)
pkgparent(exprs)
# [1] "Biobase"
library(rlang)
# The following object is masked from ‘package:Biobase’:
# exprs
pkgparent(exprs)
# [1] "rlang"
But I would still like to understand how it happens that some packages' functions are defined in an "unnamed" environment while others show up as <environment: namespace:packagename>.
What you’re seeing here is how S4 method dispatch works: .Generic is part of R’s method dispatch mechanism.
The rlang package is a red herring, by the way: the issue presents itself purely due to Biobase’s use of S4.
But more generally, your resolution strategy might fail in other situations, because there are other (if rare) reasons why packages might define functions inside a separate environment. This is usually done to create a closure over some variable.
For example, it’s generally impossible to modify variables defined at the namespace level of a package, because the namespace gets locked when it is loaded. There are multiple ways to work around this. A simple one, if a package needs a stateful function, is to define that function inside an environment. For instance, you could define a counter function that increments its count on each invocation as follows:
counter = local({
  current = 0L
  function () {
    current <<- current + 1L
    current
  }
})
local defines an environment in which the function is wrapped.
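For a function defined this way, environmentName() again returns an empty string, because the function's immediate environment is the anonymous one created by local:
environmentName(environment(counter))
# [1] ""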
To cope with this kind of situation, what you should do instead is to iterate over parent environments until you find a namespace environment. But there’s a simpler solution, because R already provides a function to find a namespace environment for a given environment (by performing said iteration):
pkgparent = function (fun) {
  nsenv = topenv(environment(fun))
  environmentName(nsenv)
}
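With this version, the earlier check should give the same answers without relying on .Generic (assuming Biobase and rlang are attached as before):
pkgparent(Biobase::exprs)
# [1] "Biobase"
pkgparent(rlang::exprs)
# [1] "rlang"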
I am building a package with functions that have default arguments. I would like to find a clean way to set the values of the default arguments once the function has been imported.
My first attempt was to set the default values to unknown objects (within the package). Then, when I load the package, an external script would assign a value to those unknown objects.
But it does not seem very clean since I am compiling a function with an unknown object.
My issue is that I will need other people to use the functions, and since they have many arguments I want to keep the code as concise as possible. And many arguments can be set by a configuration script before running the program.
So, for instance, I define in my package:
function_try <- function(x = VAL) {
  return(x)
}
I build the package and load it, and then an external script does this (possibly reading the value from a config file):
VAL <- "hello"
And then a user of the function can just run
function_try()
I would use options for that. So your function looks like:
function_try <- function(x = getOption("mypackage.default.value")) x
In your external script you make sure that the option value is set:
options(mypackage.default.value = "hello")
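With that in place, calling the function without arguments picks up the configured value, and an explicit argument still overrides it:
function_try()
# [1] "hello"
function_try(x = "explicit")
# [1] "explicit"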
IMHO that is a clean solution, because anybody reading your function will see at first sight that a certain option value is used as the default, and will also understand clearly how to override this value if needed.
I would also define a fallback value in your package's .onLoad() to make sure that the value is defined in the first place. Your functions can then even react to this fallback value and issue a meaningful warning if they realize that the external script did not, for whatever reason, provide a new value.
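A minimal sketch of such a fallback (the option name and default value are only illustrative) could go into your package's .onLoad():
.onLoad <- function(libname, pkgname) {
  # Only set the default if nothing has set this option already
  if (is.null(getOption("mypackage.default.value"))) {
    options(mypackage.default.value = "fallback")
  }
  invisible()
}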
Is it possible in R to protect function names (or variables in general) so that they cannot be masked?
I recently spotted that this can be a problem when I created a data frame with the name "new", which masked a function used by lmer and thus stopped it working. (Recovery is easy once you know what the problem is; here rm(new) did it.)
There is an easy workaround for your problem, without worrying about protecting variable names (though playing with lockBinding does look fun). If a function becomes masked, as in your example, it is still possible to call the masked version, with the help of the :: operator.
In general, the syntax is packagename::variablename.
(If the function you want has not been exported from the package, then you need three colons instead, :::. This shouldn't apply in this case however.)
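For example, with a base function that has been masked by a variable of the same name:
c <- "no longer the combine function"   # masks base::c in the global environment
base::c(1, 2, 3)                        # the package version is still reachable
# [1] 1 2 3
rm(c)                                   # remove the masking variable again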
Maybe use environments! This is a great way to separate namespaces. For example:
> a <- new.env()
> assign('printer', function(x) print(x), envir=a)
> get('printer', envir=a)('test!')
[1] "test!"
@hdallazuanna recommends (via Twitter):
new <- 1
lockBinding('new', globalenv())
This makes sense when the variable is user-created, but of course it does not prevent a package function from being masked by a new variable.
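With the binding locked as above, any later attempt to overwrite the variable fails:
new <- 2
# Error: cannot change value of locked binding for 'new'
unlockBinding("new", globalenv())   # undo the lock if you need to change it later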
I had the reverse problem from the OP: I wanted to prevent my custom functions in .Rprofile from being overridden when I defined a variable with the same name as a function. I ended up putting my functions in ~/.R.R and adding these lines to .Rprofile:
if("myfuns"%in%search())detach("myfuns")
source("~/.R.R",attach(NULL,name="myfuns"))
From the help page of attach:
One useful ‘trick’ is to use ‘what = NULL’ (or equivalently a
length-zero list) to create a new environment on the search path
into which objects can be assigned by assign or load or
sys.source.
...
## create an environment on the search path and populate it
sys.source("myfuns.R", envir = attach(NULL, name = "myfuns"))