When does a package need to use ::: for its own objects - r

Consider this R package with two functions, one exported and the other internal
hello.R
#' #export
hello <- function() {
internalFunctions:::hello_internal()
}
hello_internal.R
hello_internal <- function(x){
print("hello world")
}
NAMESPACE
# Generated by roxygen2 (4.1.1): do not edit by hand
export(hello)
When this is checked (devtools::check()) it returns the NOTE
There are ::: calls to the package's namespace in its code. A package
almost never needs to use ::: for its own objects:
‘hello_internal’
Question
Given the NOTE says almost never, under what circumstances will a package need to use ::: for its own objects?
Extra
I have a very similar related question where I do require the ::: for an internal function, but I don't know why it's required. Hopefully having an answer to this one will solve that one. I have a suspicion that unlocking the environment is doing something I'm not expecting, and thus having to use ::: on an internal function.
If they are considered duplicates of each other I'll delete the other one.

You should never need this in ordinary circumstances. You may need it if you are calling the parent function in an unusual way (for example, you've manually changed its environment, or you're calling it from another process where the package isn't attached).

Here is a pseudo-code example, where I think using ::: is the only viable solution:
# R-package with an internal function FInternal() that is called in a foreach loop
FInternal <- function(i) {...}
#' Exported function containing a foreach loop
#' #export
ParallelLoop <- function(is, <other-variables>) {
foreach(i = is) %dopar% {
# This fails, because it cannot not locate FInternal, unless it is exported.
FInternal(i)
# This works but causes a note:
PackageName:::FInternal(i)
}
}
I think the problem here is that the body of the foreach loop is not defined as a function of the package. Hence, when executed on a worker process, it is not treated as a code belonging to the package and does not have access to the internal objects of the package. I would be glad if someone could suggest an elegant solution for this specific case.

Related

Can our R package depend on an invisible `pkg:::function()` from another package [duplicate]

I am creating an R package in RStudio. Say I have two functions fnbig() and fnsmall() in my package named foo. fnbig() is a function that must be accessible to the user using the package. fnsmall() is an internal function that must not accessible to the user but should be accessible inside of fnbig().
# package code
fnsmall <- function()
{
bla bla..
}
#' #export
fnbig <- function()
{
bla bla..
x <- fnsmall()
bla..
}
I have tried exporting fnsmall(). All works but it litters the NAMESPACE. I tried not exporting fnsmall(), but then it doesn't work inside fnbig() when using x <- fnsmall() or x <- foo::fnsmall(). Then I tried to use x <- foo:::fnsmall(), and it works. But I read that using :::is not recommended.
What is the best way to go about doing this? How do I call an internal function from an exported function?
But I read that using :::is not recommended.
I think you mention this based on the following statement in the manual for writing packages by R.
Using foo:::f instead of foo::f allows access to unexported objects. This is generally not recommended, as the semantics of unexported objects may be changed by the package author in routine maintenance.
The reason for this to be not recommended is that unexported functions have no documentation and as such have no guarantee from the side of the package author that they will keep doing what they do now.
However, since you are referring to your own unexported functions, you have full control of what is happening in those functions, so this objection is not as relevant.
Referring to it as foo:::fnsmall is therefore a good option.

When I know the class of an R object, can I call the appropriate method directly?

Consider a simple R function:
#' #importFrom BarPackage Bar
Foo <- function (n, x) {
replicate(1e6, Bar(numeric(n), x))
}
If Bar() is a method, then calling Bar.numeric(numeric(n), x) saves R from having to look up the class of numeric(n); in a real-world example, this shaves about 10% off my run time.
However, Bar.numeric() is a function in a separate package (which I maintain). Roxygen creates export(Bar) and S3method(Bar, numeric) entries in NAMESPACE, but if I add importFrom BarPackage Bar to the FooPackage NAMESPACE and try to run Foo(), I see Warning message: 'Bar.numeric' is not exported by 'namespace:BarPackage' -- as indeed it's not.
It feels like I need to create a separate entry in my NAMESPACE reading export(Bar.numeric), but this feels like a very bad idea (in part because I don't think that Roxygen will do it). Is there any other way of avoiding R having to look up the class of numeric(n) on each call to Bar()?

Why/how some packages define their functions in nameless environment?

In my code, I needed to check which package the function is defined from (in my case it was exprs(): I needed it from Biobase but it turned out to be overriden by rlang).
From this SO question, I thought I could use simply environmentName(environment(functionname)). But for exprs from Biobase that expression returned empty string:
environmentName(environment(exprs))
# [1] ""
After checking the structure of environment(exprs) I noticed that it has .Generic member which contains package name as an attribute:
environment(exprs)$.Generic
# [1] "exprs"
# attr(,"package")
# [1] "Biobase"
So, for now I made this helper function:
pkgparent <- function(functionObj) {
functionEnv <- environment(functionObj)
envName <- environmentName(functionEnv)
if (envName!="")
return(envName) else
return(attr(functionEnv$.Generic,'package'))
}
It does the job and correctly returns package name for the function if it is loaded, for example:
pkgparent(exprs)
# Error in environment(functionObj) : object 'exprs' not found
library(Biobase)
pkgparent(exprs)
# [1] "Biobase"
library(rlang)
# The following object is masked from ‘package:Biobase’:
# exprs
pkgparent(exprs)
# [1] "rlang"
But I still would like to learn how does it happen that for some packages their functions are defined in "unnamed" environment while others will look like <environment: namespace:packagename>.
What you’re seeing here is part of how S4 method dispatch works. In fact, .Generic is part of the R method dispatch mechanism.
The rlang package is a red herring, by the way: the issue presents itself purely due to Biobase’s use of S4.
But more generally your resolution strategy might fail in other situations, because there are other reasons (albeit rarely) why packages might define functions inside a separate environment. The reason for this is generally to define a closure over some variable.
For example, it’s generally impossible to modify variables defined inside a package at the namespace level, because the namespace gets locked when loaded. There are multiple ways to work around this. A simple way, if a package needs a stateful function, is to define this function inside an environment. For example, you could define a counter function that increases its count on each invocation as follows:
counter = local({
current = 0L
function () {
current <<- current + 1L
current
}
})
local defines an environment in which the function is wrapped.
To cope with this kind of situation, what you should do instead is to iterate over parent environments until you find a namespace environment. But there’s a simpler solution, because R already provides a function to find a namespace environment for a given environment (by performing said iteration):
pkgparent = function (fun) {
nsenv = topenv(environment(fun))
environmentName(nsenv)
}

Where and how to define a generic function, if multiple packages are used

I know there are related posts, but with insufficient answers. So please answer seriously to this question.
There are two packages ("keithley" and "xantrex") which control two different hardware devices. Therefore, both are independent from each other. Each of them must be initialised separately. So I wrote two methods
init.keithley(inst,...) # in keythley package
and
init.xantrex(inst,...) # in xantrex package
for the generic S3 function init(inst,...). I tried to declare the generic function in the keithley package and in the xantrex package, but then it is masked, once the latter is loaded and the methods where not found any more.
What I tried is the .onAttach()-hook
.onAttach <- function(libname, pkgname)
{
if(!exists("init"))
eval(expression(init <- function(inst,...) UseMethod("init")),envir = .GlobalEnv)
}
But with this it is NOT possible to evaluate the init() function within the package namespace. This can be proofed with the option envir = environment(), which will not work. I also tried setGenericS3() and setGeneric() with always the same result.
The "dirty" solution could be to define a third package and import it, but there must be a clean way to do this.
Where and how should I define the generic function?
Here is the solution:
As I understand, an attached package has three environments (e.g. "package:Xantrex", "namespace:Xantrex" and "imports:Xantrex") the different meaning of these is explained in detail here: Advanced R.
Now, we have to test whether the generic function init() is already there and if not we have to initialize it in the right environment. The following code will do that for us.
.onAttach <- function(libname, pkgname)
{
if(!exists("init",mode = "function"))
eval(expression(init <- function(inst,...) UseMethod("init")),envir = as.environment("package:Xantrex"))
}
The .onAttach-hook, is necessary to guarantee that the different namespaces are initialized. In contrast to that the .onLoad-hook, would be too early. Mention that the expression is evaluated in the package:Xantrex environment, so the generic becomes visible in the search path.
Next to that take care, that your NAMESPACE file will export(init.xantrex) and NOT S3method(init,xantrex). The latter will result an error, because the generic for the method init.xantrex()is not present while building the package.
Best!
Martin

Search all existing functions for package dependencies?

I have a package that I wrote while learning R and its dependency list is quite long. I'm trying to trim it down, for two cases:
I switched to other approaches, and packages listed in Suggests simply aren't used at all.
Only one function out of my whole package relies on a given dependency, and I'd like to switch to an approach where it is loaded only when needed.
Is there an automated way to track down these two cases? I can think of two crude approaches (download the list of functions in all the dependent packages and automate a text search for them through my package's code, or load the package functions without loading the required packages and execute until there's an error), but neither seems particularly elegant or foolproof....
One way to check dependancies in all functions is to use the byte compiler because that will check for functions being available in the global workspace and issue a notice if it does not find said function.
So if you as an example use the na.locf function from the zoo package in any of your functions and then byte compile your function you will get a message like this:
Note: no visible global function definition for 'na.locf'
To correctly address it for byte compiling you would have to write it as zoo::na.locf
So a quick way to test all R functions in a library/package you could do something like this (assuming you didn't write the calls to other functions with the namespace):
Assuming your R files with the functions are in C:\SomeLibrary\ or subfolders there of and then you define a sourceing file as C:\SomeLibrary.r or similar containing:
if (!(as.numeric(R.Version()$major) >=2 && as.numeric(R.Version()$minor) >= 14.0)) {
stop("SomeLibrary needs version 2.14.0 or greater.")
}
if ("SomeLibrary" %in% search()) {
detach("SomeLibrary")
}
currentlyInWorkspace <- ls()
SomeLibrary <- new.env(parent=globalenv())
require("compiler",quietly=TRUE)
pathToLoad <- "C:/SomeLibraryFiles"
filesToSource <- file.path(pathToLoad,dir(pathToLoad,recursive=TRUE)[grepl(".*[\\.R|\\.r].*",dir(pathToLoad,recursive=TRUE))])
for (filename in filesToSource) {
tryCatch({
suppressWarnings(sys.source(filename, envir=SomeLibrary))
},error=function(ex) {
cat("Failed to source: ",filename,"\n")
print(ex)
})
}
for(SomeLibraryFunction in ls(SomeLibrary)) {
if (class(get(SomeLibraryFunction,envir=SomeLibrary))=="function") {
outText <- capture.output(with(SomeLibrary,assign(SomeLibraryFunction,cmpfun(get(SomeLibraryFunction)))))
if(length(outText)>0){
cat("The function ",SomeLibraryFunction," produced the following compile note(s):\n")
cat(outText,sep="\n")
cat("\n")
}
}
}
attach(SomeLibrary)
rm(list=ls()[!ls() %in% currentlyInWorkspace])
invisible(gc(verbose=FALSE,reset=TRUE))
Then start up R with no preloaded packages and source in C:\SomeLibrary.r
And then you should get notes from cmpfun for any call to a function in a package that's not part of the base packages and doesn't have a fully qualified namespace defined.

Resources