Fallback and optional dependencies in R packages the CRAN way? - r

I would like to add a fallback dependency to my package. The problem is that I want to do it in a CRAN-compliant way and can't figure out how to do it properly.
More specifically, I want to use data.table's fread / fwrite. Other than that, I do not want a full data.table dependency. If data.table is not installed, my package should just fall back to the standard read.csv and write.csv.
I've seen this similar thread: Proper way to handle optional package dependencies
and also used a technique similar to what @Hadley suggested in the comments:
req <- require(data.table)
if(req){
data.table::fwrite(...)
} else {
write.csv(...)
}
This does work, but when running CHECK I get a NOTE:
'library' or 'require' call to ‘data.table’ in package code.
Please use :: or requireNamespace() instead.
Which means I won't get it past CRAN's supervisors...
What's the correct way to handle this?

Just as the text says:
replace your (outdated) call to require() with one to requireNamespace()
Then, in the TRUE cases, call the package.
I often use :: to refer to the suggested package.
So mocking this up (and note, untested) I'd do
myreader <- function(file) {
  if (requireNamespace("data.table", quietly = TRUE)) {
    dat <- data.table::fread(file)
  } else {
    dat <- read.csv(file)
  }
  ## postprocess dat as needed
  dat
}
Searches at GitHub are useful with user:cran l=R yourTerm so try this one. I use this very idiom in a number of packages.
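The same idiom carries over to the writing side the question asks about. A minimal sketch, assuming a hypothetical helper named mywriter and that data.table is listed under Suggests (not Imports) in DESCRIPTION; the temp file and sample data frame are only there to make the example self-contained:

```r
# Hypothetical counterpart to myreader(): prefer data.table::fwrite,
# fall back to base write.csv when data.table is not installed.
mywriter <- function(x, file) {
  if (requireNamespace("data.table", quietly = TRUE)) {
    data.table::fwrite(x, file)
  } else {
    utils::write.csv(x, file, row.names = FALSE)
  }
  invisible(file)
}

# Round trip through a temporary file
tmp <- tempfile(fileext = ".csv")
mywriter(data.frame(a = 1:3, b = c("x", "y", "z")), tmp)
back <- utils::read.csv(tmp)
```

Because neither branch calls library() or require(), this should not trigger the NOTE quoted in the question.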

Related

"Found the following calls to data() loading into the global environment"

In my R-package, I have:
f <- function()
{
data('MyDataSet') # Load a dataset in my own package
... # Use MyDataSet to return something
}
The package builder has a warning message:
Found the following calls to data() loading into the global environment
What is the easiest way to fix the problem? Can I just load the data set into a variable? I don't need to save it to the global environment.
CHECK
I tested this and I could build the package without problems.
What gave me your note (not a warning, not an error) was actually running check() on the package.
PROBLEM
This matters first of all if you want to put the package on CRAN, since the package will very likely be rejected unless the check reports 0 notes, 0 warnings, 0 errors.
If you just want to use the package yourself, you could also leave it as it is. Since the check looks at coding guidelines and performs other useful tests, it may still make sense to fix this for a private package.
FIX
One solution could be to include this dataset in your package itself.
You have to create a folder called data in your package to do this and add the dataset there as an .rda file. LazyData: true also has to be set in the package DESCRIPTION for the :: access below to work (it is not R's default).
Now you can write the following:
f <- function()
{
x <- MyPackageName::MyDataSet
... # Use MyDataSet to return something
}
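If the data set is not lazy-loaded, another common fix is to keep the data() call but pass envir = environment(), so the object lands in the function's own frame rather than the global environment. A minimal sketch, using the built-in cars data set as a stand-in for MyDataSet (the function name f_local is hypothetical):

```r
f_local <- function() {
  # Load the data set into this function's frame, not the global environment
  data("cars", package = "datasets", envir = environment())
  nrow(cars)  # use the data set to return something
}

n_rows <- f_local()
```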

Dealing with conflicting namespaces in R (same function names in different packages): reset precedence of a package namespace

The name conflicts between namespaces from different packages in R can be dangerous, and the use of package::function is unfortunately not generalized in R...
Isn't there a function that can reset the precedence of a package namespace over all the others currently loaded?
Surely we can detach and then reload the package, but isn't there any other, more practical (one-command) way?
Because I often end up with many packages and name conflicts in my R sessions, I use the following function to do that:
set_precedence <- function(pckg) {
pckg <- deparse(substitute(pckg))
detach(paste("package", pckg, sep = ":"), unload=TRUE, character.only=TRUE)
library(pckg, character.only=TRUE)
}
# Example
set_precedence(dplyr)
No built-in way to achieve this in a single command?
Or a way that doesn't imply detaching and reloading the package, in case it is heavy to load, and working directly on namespaces?
Here's a way that does not reload the package and works directly on the environments/namespaces.
Just replace your library() call with attachNamespace():
set_precedence <- function(pkg) {
detach(paste0("package:", pkg), character.only = TRUE)
attachNamespace(pkg)
}
set_precedence('utils')
# now utils is in pos #2 in `search()`
I would suggest taking a look at the conflicted package.
Prefix the package name with a double colon: <package>::<function>()
For instance:
ggplot2::ggplot(data=data, ggplot2::aes(x=x)) +
ggplot2::geom_histogram()
More typing, but I feel so much less anxious using R now that I have found this.
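To illustrate why the prefix sidesteps precedence entirely, here is a small base-R sketch in which a user-defined sd (a deliberately silly stand-in for a conflicting package function) masks stats::sd on the search path:

```r
# A global definition that masks stats::sd
sd <- function(x) "masked"

masked_result    <- sd(c(1, 2, 3))         # picks up the masking function
qualified_result <- stats::sd(c(1, 2, 3))  # always the version from stats

rm(sd)  # remove the mask again
```

The qualified call is resolved in the stats namespace, so the order of search() does not matter for it.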

Automatically install list of packages in R if necessary

I would like to check, at the beginning of my R script, whether the required packages are installed and, if not, install them.
I would like to use something like the following:
RequiredPackages <- c("stockPortfolio","quadprog")
for (i in RequiredPackages) { #Installs packages if not yet installed
if (!require(i)) install.packages(i)
}
However, this gives me error messages because R tries to install a package named 'i'. If instead I use...
if (!require(i)) install.packages(get(i))
...in the relevant line, I still get error messages.
Anybody know how to solve this?
Although the problem has been solved by @Thomas's answer, I would like to point out that pacman might be a better yet simple choice:
First install pacman:
install.packages("pacman")
Then load packages. Pacman will check whether each package has been installed, and if not, will install it automatically.
pacman::p_load("stockPortfolio","quadprog")
That's it.
Relevant links:
pacman GitHub page
Introduction to pacman
I think this is close to what you want:
requiredPackages <- c("stockPortfolio","quadprog")
for (package in requiredPackages) { #Installs packages if not yet installed
if (!requireNamespace(package, quietly = TRUE))
install.packages(package)
}
HERE is the source code and an explanation of the requireNamespace function.
Both library and require use non-standard evaluation on their first argument by default. This makes them hard to use in programming. However, they both take a character.only argument (Default is FALSE), which you can use to achieve your result:
RequiredPackages <- c("stockPortfolio","quadprog")
for (i in RequiredPackages) { #Installs packages if not yet installed
if (!require(i, character.only = TRUE)) install.packages(i)
}
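The loop can also be split into a "find what is missing" step and an "install" step. A minimal sketch, assuming a hypothetical helper named missing_packages; the example call uses only packages that ship with R, so nothing is actually installed:

```r
# Return the subset of pkgs whose namespace cannot be loaded
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

to_install <- missing_packages(c("stats", "utils"))
if (length(to_install) > 0) install.packages(to_install)
```

Unlike require(), requireNamespace() only loads the namespace and does not attach the package, so a library() call is still needed afterwards if the script relies on unqualified function names.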
I have by now written the following function (and put it into a package), which essentially does what @Thomas and @federico propose:
SPLoadPackages <- function(packages) {
  for (fP in packages) {
    # character.only = TRUE makes require()/library() treat fP as a string
    if (!require(fP, character.only = TRUE)) install.packages(fP)
    library(fP, character.only = TRUE)
  }
}

R Package development: overriding a function from one package with a function from another?

I am currently working on developing two packages, below is a simplified
version of my problem:
In package A I have some functions (say "sum_twice"), and it calls
another function inside the package (say "slow_sum").
However, in package B, I wrote another function (say "fast_sum"), with
which I wish to replace the slow function in package A.
Now, how do I manage this "overriding" of the "slow_sum" function with the
"fast_sum" function?
Here is a simplified example of such functions (just to illustrate):
############################
##############
# Functions in package A
slow_sum <- function(x) {
sum_x <- 0
for(i in seq_along(x)) sum_x <- sum_x + x[i]
sum_x
}
sum_twice <- function(x) {
x2 <- rep(x,2)
slow_sum(x2)
}
##############
# A function in package B
fast_sum <- function(x) { sum(x) }
############################
If I only do something like slow_sum <- fast_sum, this would not work, since "sum_twice" uses "slow_sum" from the NAMESPACE
of package A.
I tried using the following function when loading package "B":
assignInNamespace(x = "slow_sum", value = B:::fast_sum, ns = "A")
This indeed works, however, it makes the CRAN checks return both a NOTE on
how I should not use ":::", and also a warning for using assignInNamespace
(since it is supposed to not be very safe).
However, I am at a loss.
What would be a way to have "sum_twice" use "fast_sum" instead of
"slow_sum"?
Thank you upfront for any feedback or suggestion,
With regards,
Tal
p.s: this is a double post from here.
UPDATE: motivation for this question
I am developing two packages. One is based solely on R and works fine (but is a bit slow): dendextend, which is now on CRAN. The other is meant to speed up the first by using Rcpp: dendextendRcpp, which is on GitHub. The second package speeds up the first by overriding some basic functions the first package uses. But for the higher-level functions in the first package to use the lower-level functions in the second, I have to use assignInNamespace, which makes CRAN throw warnings and NOTEs, and that ended up getting the package rejected from CRAN (until these warnings are avoided).
The problem is that I have no idea how to approach this issue. The only solutions I can think of are either merging the two packages (making them harder to maintain, and automatically requiring a larger dependency structure from anyone who wants to use the package), or copy-pasting the higher-level functions from dendextend into dendextendRcpp so that they mask the originals. But I find the latter MUCH less elegant (because it means copy-pasting MANY functions, forcing more duplicated-code maintenance). Any other ideas? Thanks.
We could put this in sum_twice:
my_sum_ch <- getOption("my_sum", if ("package:fastpkg" %in% search())
"fast_sum" else "slow_sum")
my_sum <- match.fun(my_sum_ch)
If the "my_sum" option were set then that version of my_sum would be used and if not it would make the decision based on whether or not fastpkg had been loaded.
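A self-contained variation on this idea, for illustration only: here the option holds the function itself rather than its name, and the fallback is always slow_sum (the search() test for fastpkg is left out). The function bodies are the ones from the question:

```r
slow_sum <- function(x) {
  sum_x <- 0
  for (i in seq_along(x)) sum_x <- sum_x + x[i]
  sum_x
}
fast_sum <- function(x) sum(x)

sum_twice <- function(x) {
  # Use the function stored in the "my_sum" option, else fall back to slow_sum
  my_sum <- getOption("my_sum", default = slow_sum)
  my_sum(rep(x, 2))
}

res_slow <- sum_twice(1:3)   # option unset: slow_sum is used
options(my_sum = fast_sum)   # what the faster package could do when loaded
res_fast <- sum_twice(1:3)   # now fast_sum is used
options(my_sum = NULL)       # unset the option again
```

Both calls return the same value; only the implementation behind sum_twice changes.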
The solution I ended up using (thanks to Uwe and Kurt), is using "local" to create a localized environment with the package options. If you're curious, the function is called "dendextend_options", and is here:
https://github.com/talgalili/dendextend/blob/master/R/zzz.r
Here is an example for its use:
dendextend_options <- local({
options <- list()
function(option, value) {
# ellipsis <- list(...)
if(missing(option)) return(options)
if(missing(value))
options[[option]]
else options[[option]] <<- value
}
})
dendextend_options("a")
dendextend_options("a", 1)
dendextend_options("a")
dendextend_options("a", NULL)
dendextend_options("a")
dendextend_options()
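How such an options closure might be wired into the sum_twice example from the question can be sketched as follows. This is an assumption about the usage, not the actual dendextend code; pkg_options and the "sum_fun" key are hypothetical names:

```r
slow_sum <- function(x) {
  sum_x <- 0
  for (i in seq_along(x)) sum_x <- sum_x + x[i]
  sum_x
}
fast_sum <- function(x) sum(x)

# Package A: private option store, same pattern as dendextend_options
pkg_options <- local({
  opts <- list()
  function(option, value) {
    if (missing(option)) return(opts)
    if (missing(value)) return(opts[[option]])
    opts[[option]] <<- value
    invisible(value)
  }
})

# Package A: look the implementation up at call time
sum_twice <- function(x) {
  f <- pkg_options("sum_fun")
  if (is.null(f)) f <- slow_sum  # default when nothing was registered
  f(rep(x, 2))
}

before <- sum_twice(1:3)          # uses slow_sum
pkg_options("sum_fun", fast_sum)  # package B would do this when it loads
after <- sum_twice(1:3)           # uses fast_sum
```

Since package B only calls an exported setter of package A, neither ::: nor assignInNamespace is involved.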

Search all existing functions for package dependencies?

I have a package that I wrote while learning R and its dependency list is quite long. I'm trying to trim it down, for two cases:
I switched to other approaches, and packages listed in Suggests simply aren't used at all.
Only one function out of my whole package relies on a given dependency, and I'd like to switch to an approach where it is loaded only when needed.
Is there an automated way to track down these two cases? I can think of two crude approaches (download the list of functions in all the dependent packages and automate a text search for them through my package's code, or load the package functions without loading the required packages and execute until there's an error), but neither seems particularly elegant or foolproof....
One way to check dependencies in all functions is to use the byte compiler, because it checks whether the functions being called are visible and issues a note if it does not find one.
So if you as an example use the na.locf function from the zoo package in any of your functions and then byte compile your function you will get a message like this:
Note: no visible global function definition for 'na.locf'
To correctly address it for byte compiling you would have to write it as zoo::na.locf
So a quick way to test all R functions in a library/package is something like the following (assuming you didn't write the calls to other packages' functions with the namespace prefix).
Assume your R files with the functions are in C:\SomeLibrary\ or subfolders thereof, and define a sourcing file such as C:\SomeLibrary.r containing:
if (getRversion() < "2.14.0") {
stop("SomeLibrary needs version 2.14.0 or greater.")
}
if ("SomeLibrary" %in% search()) {
detach("SomeLibrary")
}
currentlyInWorkspace <- ls()
SomeLibrary <- new.env(parent=globalenv())
require("compiler",quietly=TRUE)
pathToLoad <- "C:/SomeLibraryFiles"
filesToSource <- list.files(pathToLoad, pattern = "\\.[Rr]$", recursive = TRUE, full.names = TRUE)
for (filename in filesToSource) {
tryCatch({
suppressWarnings(sys.source(filename, envir=SomeLibrary))
},error=function(ex) {
cat("Failed to source: ",filename,"\n")
print(ex)
})
}
for(SomeLibraryFunction in ls(SomeLibrary)) {
if (is.function(get(SomeLibraryFunction,envir=SomeLibrary))) {
outText <- capture.output(with(SomeLibrary,assign(SomeLibraryFunction,cmpfun(get(SomeLibraryFunction)))))
if(length(outText)>0){
cat("The function ",SomeLibraryFunction," produced the following compile note(s):\n")
cat(outText,sep="\n")
cat("\n")
}
}
}
attach(SomeLibrary)
rm(list=ls()[!ls() %in% currentlyInWorkspace])
invisible(gc(verbose=FALSE,reset=TRUE))
Then start up R with no preloaded packages and source in C:\SomeLibrary.r
And then you should get notes from cmpfun for any call to a function in a package that's not part of the base packages and doesn't have a fully qualified namespace defined.
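A different route to the same information, not the byte-compiler method described above, is to ask codetools (a recommended package that ships with standard R installations) which global functions a function refers to, and then check which of them cannot be found. A minimal sketch, assuming zoo is not attached; the function f is a made-up example:

```r
# Example function that uses na.locf without a zoo:: prefix
f <- function(x) na.locf(x) + mean(x)

# Names of all global functions referenced in the body of f
used <- codetools::findGlobals(f, merge = FALSE)$functions

# Keep only those that are not visible on the current search path
is_visible <- vapply(used, function(n) exists(n, mode = "function"), logical(1))
unresolved <- used[!is_visible]
```

Here unresolved contains "na.locf", which points at a dependency that is used without a qualified name.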
