Selectively import functions using regex - r

I am looking to import only a few functions from a package. Based on this issue, I can use @rawNamespace to import a package while excluding specific functions via except.
However, what I would like to do is close to this answer. I would like to define a regular expression to only import certain functions automatically. I would like to avoid importing an entire package just for a couple of functions.
Example
#' My fancy function
#' @rawNamespace import(ggplot2, except = scale_fill_manual)
#' @export
hello_world <- function() {
  print("Hello World!")
}
In the above example, I would like to do something like:
#' My fancy function
#' @rawNamespace import(ggplot2, include = scale_*)
#' @export
hello_world <- function() {
  print("Hello World!")
}
The above example is super basic but I will actually use the imported functions somewhere else. I cannot simply use :: accessors as I am programmatically getting the functions from the namespace.

Based on this answer, my current workaround is:
lapply(Filter(function(x) grepl("scale_", x), getNamespaceExports("ggplot2")),
utils::getFromNamespace, "ggplot2")
The above lets me retrieve all ggplot2 scale functions while only requiring a utils import in DESCRIPTION. However, I think this may be less than ideal, since it perhaps requires ggplot2 (or whatever package) to be on the search path.
This is also flawed because I then need to add names to the list to be able to tell which function is which.
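One way to address the naming problem in the workaround, sketched with base R only: filter the export names with grep(value = TRUE) and build the list with sapply(..., simplify = FALSE), which keeps the function names as list names. The helper name get_matching_exports is hypothetical, and the example uses the stats package so that it runs without ggplot2 installed. It retrieves the functions at run time and does not add anything to NAMESPACE.

```r
# Hypothetical helper: return a named list of the exported objects of
# `pkg` whose names match the regular expression `pattern`.
get_matching_exports <- function(pkg, pattern) {
  # Names of everything the package exports that match the pattern
  nms <- grep(pattern, getNamespaceExports(pkg), value = TRUE)
  # sapply() on a character vector uses the values as names, so the
  # result is a list named after the objects it contains
  sapply(nms, function(nm) getExportedValue(pkg, nm), simplify = FALSE)
}

# Example with a package that ships with R; "ggplot2" and "^scale_"
# would be used the same way
fns <- get_matching_exports("stats", "^r(norm|unif)$")
sort(names(fns))
```

getNamespaceExports() loads the namespace if it is not loaded yet, so the package does not need to be attached with library(); it would presumably still have to be listed in DESCRIPTION.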

Related

Explicit namespaces for non-exported functions within same R package - best practice

I have an R package (MyPackage) that has some exported (using @export) and some non-exported functions. If I call a non-exported function from elsewhere in the package, what is the most appropriate way to reference it? For example, given the following code:
#' @export
f1 <- function(){
  f2()
}
f2 <- function(){
  print('hello')
}
When I run linting on the package I get the warning:
no visible global function definition for 'f2'
I could use MyPackage:::f2, but my understanding was that this isn't necessary. I do not expect to get the warning 'no visible global function definition' for a function within the same package. What is the best practice in this case?
As others and I have mentioned, your code does not produce such a warning.
In terms of best practices, don't use MyPackage:::f2. As mentioned here:
A function that is not exported to the user is not present in the NAMESPACE. If
you use roxygen, putting the @export tag ONLY on the functions you want to
export will do the job.
As you did, just using the @export tag for the functions you want to make available, and not for the internal functions, is the way to go. You should just decorate your internal functions with a few roxygen comments, then decide whether you want to create a manual page for each of them or not.
If you don't wish to create a manual page for f2, you should use the #' @noRd tag. (source)
#' Internal function printing "hello"
#' @description A function that prints the text "hello".
#' @noRd
f2 <- function(){
  print('hello')
}
If you wanted to create a manual page for f2, but to exclude it from the index of the manual, you could use @keywords internal, which works even with f1 or basically any function that you wouldn't want too visible in your manual.
#' Internal function printing "hello"
#' @description A function that prints the text "hello".
#' @keywords internal
f2 <- function(){
  print('hello')
}

How can I source specific functions in an R script?

I have a script with my most commonly used functions which I source at the top of most scripts. Sometimes I only want to get one of the functions in that script, but I don't know how to indicate that I only want one specific function. I'm looking for a function that is similar to the :: used to get a function inside a package. A reproducible example:
# file a.R
foo <- function() cat("Hello!\n")
bar <- function() cat("Goodbye!\n")
# End of file a.R
# file b.R
# Can't just delete all functions
fun <- function(x) print(x)
fun("It's so late!")
source("a.R")
foo()
fun("See you next time")
# End of file
I read the "source" help and it was unhelpful to me. The solution I currently have is to assign a variable at the start of the script with the functions loaded before, then set the difference with what was there after:
list_before <- lsf.str()
# content of file b.R
new_funcs <- setdiff(lsf.str(),list_before)
Then I can use rm(list=new_funcs[-1]) to keep only the function I wanted. This is, however, a very convoluted way of doing it, and I was hoping to find an easier solution.
A good way would be to write a package but it requires more knowledge (not there myself).
A good alternative I found is the package box, which allows you to import functions from an R script as a module.
You can import all functions or specific functions.
To set up a function as a module, you would use the roxygen2 documentation syntax as such:
#' This is a function to calculate a sum
#' @export
my_sum <- function(x, y){
  x + y
}
#' This is a function to calculate a difference
#' @export
my_diff <- function(x, y){
  x - y
}
Save the file as an R script "my_module.R"
The @export tag tells box which functions the module exports. You can then use box to reach a specific function in the module named "my_module".
Let's say your project directory has a script folder that contains your scripts and modules, you would import functions as such:
box::use(script/my_module)
my_module$my_sum(x, y)
box::use() creates an environment that contains all the functions found inside the module.
You can also import single functions as follows. Let's assume your directory is a bit more complex as well, with modules inside a box folder inside script.
box::use(./script/box/my_module[my_sum])
my_sum(x, y)
You can use box to fetch functions from packages as well. In a sense, this is better than calling library(), which attaches every exported function of the package.
Using box, you can organize scripts by objective or by whatever organization you have in place.
I have a script for dealing with strings, from which I fetch functions that work with strings.
I have a script for plot functions that I use in my projects, etc.
insertSource() would help.
In your example, let's presume we need to import foo() from a.R :
# file b.R
foo <- function(){}
insertSource("a.R", functions = "foo", force = TRUE)
foo <- foo@.Data
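A base-R alternative, offered as a sketch rather than a definitive answer: source the file into a fresh environment with sys.source() and copy out only the functions you want. The helper name source_only is hypothetical, and the temporary file below stands in for a.R.

```r
# Hypothetical helper: source `file` into a private environment and copy
# only the objects listed in `names` into `envir`.
source_only <- function(file, names, envir = parent.frame()) {
  tmp <- new.env()
  sys.source(file, envir = tmp)  # nothing is defined in `envir` yet
  for (nm in names) {
    assign(nm, get(nm, envir = tmp), envir = envir)
  }
  invisible(names)
}

# Stand-in for a.R, written to a temporary file for the example
a_file <- tempfile(fileext = ".R")
writeLines(c(
  'foo <- function() cat("Hello!\\n")',
  'bar <- function() cat("Goodbye!\\n")'
), a_file)

source_only(a_file, "foo")
foo()                            # foo is available
exists("bar", inherits = FALSE)  # bar was not copied over
```

The copied functions keep the private environment as their enclosure, so foo could still call helpers defined in the same file even though those helpers are not visible to the calling script.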

R Package: how "import" works when my exported function does not call explicitly a function from other packages, but a subroutine does

I am developing my first R package and there is something that is not clear to me about Imports in the DESCRIPTION file. I went through quite a few guides that explain package structure but I did not find an answer to my question, so here is my situation.
I define a function f that I will export, so its definition will have the proper @export roxygen comment on top.
Now, my function f calls a subroutine hidden that I do not want to export. Function hidden uses other packages too, say package X.
Because the call to X is inside function hidden, there is no tag @import X in my function f. Thus, I added package X to the Imports in my DESCRIPTION file, hoping to specify the relevant dependency there.
When I use devtools::document(), however, the generated NAMESPACE does not contain an entry for X. I can see why that happens: the parser just does not find the tag in the roxygen comment for f, and at runtime a call to f crashes because X is missing.
Now, I can probably fix everything by specifying X in the import of f. But why is the mechanism this tricky? Or, similarly, why do my imports in DESCRIPTION not match the ones in NAMESPACE?
My understanding is that there are three "correct" ways to do the import. By "correct," I mean that they will pass CRAN checks and function properly. Which option you choose is a matter of balancing various advantages and is largely subjective.
I'll review these options below using the following terminology:
primary_function: the function in your package that you wish to export
hidden: the unexported function in your package used by primary_function
thirdpartypkg::blackbox: blackbox is an exported function from the thirdpartypkg package
Option 1 (no direct import / explicit function call)
I think this is the most common approach. thirdpartypkg is declared in the DESCRIPTION file, but nothing is imported from thirdpartypkg in the NAMESPACE file. In this option, it is necessary to use the thirdpartypkg::blackbox construct to get the desired behavior.
# DESCRIPTION
Imports: thirdpartypkg
# NAMESPACE
export(primary_function)
#' @name primary_function
#' @export
primary_function <- function(x, y, z){
  # do something here
  hidden(a = y, b = x, c = z)
}
# Unexported function
#' @name hidden
hidden <- function(a, b, c){
  # do something here
  thirdpartypkg::blackbox(a, c)
}
Option 2 (direct import / no explicit function call)
In this option, you directly import the blackbox function. Having done so, it is no longer necessary to use thirdpartypkg::blackbox; you may simply call blackbox as if it were a part of your package. (Technically it is: you imported it into your namespace, so there's no need to reach into another namespace to get it.)
# DESCRIPTION
Imports: thirdpartypkg
# NAMESPACE
export(primary_function)
importFrom(thirdpartypkg, blackbox)
#' @name primary_function
#' @export
primary_function <- function(x, y, z){
  # do something here
  hidden(a = y, b = x, c = z)
}
# Unexported function
#' @name hidden
#' @importFrom thirdpartypkg blackbox
hidden <- function(a, b, c){
  # do something here
  # I CAN USE blackbox HERE AS IF IT WERE PART OF MY PACKAGE
  blackbox(a, c)
}
Option 3 (direct import / explicit function call)
Your last option combines the previous two options and imports blackbox into your namespace, but then uses the thirdpartypkg::blackbox construct to utilize it. This is "correct" in the sense that it works. But it can be argued to be wasteful and redundant.
The reason I say it is wasteful and redundant is that, having imported blackbox to your namespace, you're never using it. Instead, you're using the blackbox in the thirdpartypkg namespace. Essentially, blackbox now exists in two namespaces, but only one of them is ever being used, which raises the question of why the copy is made at all.
# DESCRIPTION
Imports: thirdpartypkg
# NAMESPACE
export(primary_function)
importFrom(thirdpartypkg, blackbox)
#' @name primary_function
#' @export
primary_function <- function(x, y, z){
  # do something here
  hidden(a = y, b = x, c = z)
}
# Unexported function
#' @name hidden
#' @importFrom thirdpartypkg blackbox
hidden <- function(a, b, c){
  # do something here
  # I CAN USE blackbox HERE AS IF IT WERE PART OF MY PACKAGE
  # EVEN THOUGH I DIDN'T. CONSEQUENTLY, THE blackbox I IMPORTED
  # ISN'T BEING USED.
  thirdpartypkg::blackbox(a, c)
}
Considerations
So which is the best approach to use? There isn't really an easy answer to that. I will say that Option 3 is probably not the approach to take. I can tell you that Wickham advises against Option 3 (I had been developing under that framework and he advised me against it).
If we make the choice between Option 1 and Option 2, the considerations we have to make are 1) efficiency of writing code, 2) efficiency of reading code, and 3) efficiency of executing code.
When it comes to the efficiency of writing code, it's generally easier to @importFrom thirdpartypkg blackbox and avoid having to use the :: operator. It just saves a few keystrokes. This adversely affects readability of code, however, because now it isn't immediately apparent where blackbox comes from.
When it comes to efficiency of reading code, it's superior to omit @importFrom and use thirdpartypkg::blackbox. This makes it obvious where blackbox comes from.
When it comes to efficiency of executing code, it's better to @importFrom. Calling thirdpartypkg::blackbox is about 0.1 milliseconds slower than using @importFrom and calling blackbox. That isn't a lot of time, so probably isn't much of a consideration. But if your package uses hundreds of :: constructs and then gets thrown into looping or resampling processes, those milliseconds can start to add up.
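The size of this overhead depends on the machine and the R version, so it may be worth measuring rather than relying on a quoted figure. A rough sketch using only base R, with stats::sd standing in for thirdpartypkg::blackbox (an assumption made so that the example runs anywhere):

```r
x <- rnorm(100)
n <- 10000L

# Plain call: `sd` is resolved through the usual symbol lookup
# (in a package, through the imports declared with importFrom)
t_plain <- system.time(for (i in seq_len(n)) sd(x))[["elapsed"]]

# Qualified call: `::` looks the function up in the namespace on every call
t_qualified <- system.time(for (i in seq_len(n)) stats::sd(x))[["elapsed"]]

# Approximate extra cost per call, in microseconds (this can be noisy,
# or even negative, on a busy machine)
(t_qualified - t_plain) / n * 1e6
```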
Ultimately, I think the best guidance I've read (and I don't know where) is that if you are going to call blackbox more than a handful of times, it's worth using @importFrom. If you will only call it three or four times in a package, go ahead and use the :: construct.

S3 generic method not appearing in package manual

In my R package, a few functions are omitted from the package manual .pdf file - and they are all S3 methods where several functions are documented together. All other "normal" functions appear correctly, so I suspect I'm not documenting the S3 methods correctly.
I want an entry for myfun to appear in the manual. Right now, the function is missing from the .pdf manual entirely, though it can still be called correctly and its help page referenced with ?myfun. Are my Roxygen2 keywords wrong?
#' @export
myfun <- function(...) UseMethod("myfun")
#' @inheritParams myfun
#' @describeIn myfun Create a frequency table from a vector.
#' @export
#' @keywords internal
myfun.default <- function(vec, sort = FALSE, show_na = TRUE, ...) {
  ...
}
#' @inheritParams myfun.default
#' @describeIn myfun Create a frequency table from a data.frame,
#' supplying the unquoted name of the column to tabulate.
#' @export
#' @keywords internal
myfun.data.frame <- function(.data, ...){
  ...
}
(I omitted the @title, @description, @param, @return, @examples lines to keep this question shorter but can edit them in if relevant).
The generic methods are exporting as intended, so that the user only sees myfun() and not myfun.default() or myfun.data.frame(), unless they use the triple colon :::. I'd like to retain that behavior, so the user just calls myfun, while also having an entry for myfun in the package manual.
I removed @keywords internal from the two myfun.* methods and that did it: myfun appears in the package's manual. I also switched to @rdname myfun instead of @describeIn myfun, to eliminate the section "Methods (by class):" in the function's documentation.
What made this hard to isolate was that if I run devtools::document() and then don't restart the session, the methods myfun.data.frame and myfun.default are visible in RStudio's autocomplete and can be called directly. They are not supposed to be accessible to the user, and I thought my Roxygen2 documentation was to blame.
In fact, all I had to do was remove @keywords internal.
The methods myfun.data.frame and myfun.default do appear in the autocomplete after typing ?, e.g., ?myfun.default, but I think there's no way around that (and it directs to the single help page for all three myfun functions, anyway). This is standard. For example, ?print.aov is a visible help file while print.aov() cannot be called directly.
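Putting the fix together, the documentation could look roughly like the sketch below: no @keywords internal on the methods, and @rdname myfun so that the generic and both methods share one help page. The title, the parameter descriptions, the function bodies, and the generic's x argument are placeholders invented for illustration, not the original package code.

```r
#' Create a frequency table
#'
#' @param x A vector or data.frame to tabulate.
#' @param ... Arguments passed on to methods.
#' @export
myfun <- function(x, ...) UseMethod("myfun")

#' @rdname myfun
#' @export
myfun.default <- function(x, ...) {
  # Placeholder body: count each distinct value, showing NA if present
  table(x, useNA = "ifany")
}

#' @rdname myfun
#' @export
myfun.data.frame <- function(x, ...) {
  # Placeholder body: tabulate the first column
  myfun(x[[1]], ...)
}

# The user only ever calls the generic
res <- myfun(data.frame(g = c("a", "b", "a")))
res
```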

When does a package need to use ::: for its own objects

Consider this R package with two functions, one exported and the other internal
hello.R
#' @export
hello <- function() {
  internalFunctions:::hello_internal()
}
hello_internal.R
hello_internal <- function(x){
print("hello world")
}
NAMESPACE
# Generated by roxygen2 (4.1.1): do not edit by hand
export(hello)
When this is checked (devtools::check()) it returns the NOTE
There are ::: calls to the package's namespace in its code. A package
almost never needs to use ::: for its own objects:
‘hello_internal’
Question
Given the NOTE says almost never, under what circumstances will a package need to use ::: for its own objects?
Extra
I have a very similar related question where I do require the ::: for an internal function, but I don't know why it's required. Hopefully having an answer to this one will solve that one. I have a suspicion that unlocking the environment is doing something I'm not expecting, and thus having to use ::: on an internal function.
If they are considered duplicates of each other I'll delete the other one.
You should never need this in ordinary circumstances. You may need it if you are calling the parent function in an unusual way (for example, you've manually changed its environment, or you're calling it from another process where the package isn't attached).
Here is a pseudo-code example, where I think using ::: is the only viable solution:
# R package with an internal function FInternal() that is called in a foreach loop
FInternal <- function(i) {...}
#' Exported function containing a foreach loop
#' @export
ParallelLoop <- function(is, <other-variables>) {
  foreach(i = is) %dopar% {
    # This fails because it cannot locate FInternal, unless it is exported.
    FInternal(i)
    # This works but causes a NOTE:
    PackageName:::FInternal(i)
  }
}
I think the problem here is that the body of the foreach loop is not defined as a function of the package. Hence, when executed on a worker process, it is not treated as code belonging to the package and does not have access to the package's internal objects. I would be glad if someone could suggest an elegant solution for this specific case.

Resources