Determining which version of a function is active when many packages are loaded - r

If I have multiple packages loaded that define functions of the same name, is there an easy way to determine which version of the function is currently the active one? Like, lets say I have base R, the tidyverse, and a bunch of time series packages loaded. I'd like a function which_package("intersect") that would tell me the package name of the active version of the intersect function. I know you can go back and look at all the warning messages you recieved when installing packages, but I think that sort of manual search is not only tedious but also error-prone.
There is a function here that does sort of what I want, except it produces a table for all conflicts rather than the value for one function. I would actually be quite happy with that, and would also accept a similar function as an answer, but I have had problems with the implimentation of function given. As applied to my examples, it inserts vast amounts of white space and many duplicates of the package names (e.g. the %>% function shows up with 132 packages listed), making the output hard to read and hard to use. It seems like it should be easy to remove the white space and duplicates, and I have spent considerable time on various approaches that I expected to work but which had no impact on the outcome.
So, for an example of many conflicts:
install.packages(pkg = c("tidyverse", "fpp3", "tsbox", "rugarch", "Quandl", "DREGAR", "dynlm", "zoo", "GGally", "dyn", "ARDL", "bigtime", "BigVAR", "dLagM", "VARshrink")
lapply(x = c("tidyverse", "fable", "tsbox", "rugarch", "Quandl", "DREGAR", "dynlm", "zoo", "GGally", "dyn", "ARDL", "bigtime", "BigVAR", "dLagM", "VARshrink"),
library, character.only = TRUE)

You can pull this information with your own function helper.
which_package <- function(fun) {
if(is.character(fun)) fun <- getFunction(fun)
stopifnot(is.function(fun))
x <- environmentName(environment(fun))
if (!is.null(x)) return(x)
}
This will return R_GlobalEnv for functions that you define in the global environment. There is also the packageName function if you really want to restrict it to packages only.
For example
library(MASS)
library(dplyr)
which_package(select)
# [1] "dplyr"

Related

R:"Note: no visible binding for global variable" after devtools:use_data

I want to make a dataframe available to users of a package, as well as to functions of the package.
Starting from a new package I've used
devtools::use_data_raw("my_data")
This creates the file data_raw/my_data.R, which I edit as
my_data <- data.frame(x = runif(3), y = runif(3))
devtools::use_data(my_data, overwrite = TRUE)
After running the code above, a file data/my_data.Rda is created.
According to Hadley Wickham's R Packages every file under data/ is exported, and if I try
load_all()
my_data
I can see that this is the case. However if I now try to use the my_data dataframe inside a function in the package, say like R/test_my_data.R
test_my_data <- function {
my_data
}
and then I run
devtools::check()
I get the infamous no visible binding for global variable my_data NOTE.
I appreciate there are many questions already on this topic but many are related to the cases where a tidy evaluation function is used, or data from another package is referred to. Why is R CMD check failing on the example above, and what's the correct way of sorting this out?
I'm aware that the
utils::globalVariables("my_data")
solution will avoid this NOTE, but I'd like to know if there's a proper way of informing R CMD check that my_data actually exists.

Removing/de-registering a specific function from an R package

I may not be using the terminology correctly here so please forgive me...
I have a case of one package 'overwriting' a function with the same name loaded by another package and thus changing the behavior (breaking) of a function.
The specific case:
X <- data.frame ( y = rnorm(100), x1 = rnorm(100), x2 = rnorm(100) )
library(CausalImpact)
a <- CausalImpact::CausalImpact( X, c(1,75), c(76, 100) ) # works
library(bfast) # imports quantmod which loads crappy version of as.zoo.data.frame
b <- CausalImpact::CausalImpact( X, c(1,75), c(76, 100) ) # Error
I know the error comes from two versions of the function as.zoo.data.frame.
The problematic version is imported by bfast from the package 'quantmod' (see https://github.com/joshuaulrich/quantmod/issues/168). Unfortunately their hotfix did not prevent this error. Super annoying.
I can hack around this specific problem, but I was wondering if there is a general way to like 'de-register' this function variant from the search path. Neither detach nor unloadNamespace remove the offending function (same behavior after). An explanation and similar problem is discussed here and here, but I wasn't able to find a general solution. For instance I'd rather just remove this function than clone and re-write CausalImpact to deal with this behavior.
From R 3.6.0 onwards, there is a new option called "conflicts.policy" to handle this within an established framework. For small issues like this, you can use the new arguments to library(). If you aren't yet to 3.6, the easiest solution might be to explicitly namespace CausalImpact when you need it, i.e. CausalImpact::CausalImpact. That's a mouthful, so you could do causal_impact <- CausalImpact::CausalImpact and use that alias.
# only attach select
library(dplyr, include.only = "select")
# exclude slice/arrange from being attached.
library(dplyr, exclude = c("slice", "arrange"))
library(bfast, exclude = "CausalImpact") should solve your problem.
Attach means that they are available for use without explicit prefixing with their package. In either of these cases, something like dplyr::slice would work just fine.
For more information, you can see ?library. Also, the R-Core member Luke Tierney wrote a blog explaining how the conflicts.policy works. You can find that here
Here's an answer that works, but is less preferable than de-registering a S3 method because it involves replacing the registered version in the S3 Methods table with the desired method:
library(CausalImpact)
library(bfast)
assignInNamespace("as.zoo.data.frame", zoo:::as.zoo.data.frame, ns = asNamespace("zoo"))
based partially on #smingerson's suggestion in the comments

Choose function to load from an R package

I like using function reshape from the matlab package, but I need then to specify base::sum(m) each time I want to sum the elements of my matrix or else matlab::sum is called, which only sums by columns..
I need loading package gtools to use the rdirichlet function, but then the function gtools::logit masks the function pracma::logit that I like better..
I gess there are no such things like:
library(loadOnly = "rdirichlet", from = "gtools")
or
library(loadEverythingFrom = "matlab", except = "sum")
.. because functions from the package matlab may internaly work on the matlab::sum function. So the latter must be loaded. But is there no way to get this behavior from the point of view of the user? Something that would feel like:
library(pracma)
library(matlab)
library(gtools)
sum <- base::sum
logit <- pracma::logit
.. but that would not spoil your ls() with all these small utilitary functions?
Maybe I need defining my own default namespace?
To avoid spoiling your ls, you can do something like this:
.ns <- new.env()
.ns$sum <- base::sum
.ns$logit <- pracma::logit
attach(.ns)
To my knowledge there is no easy answer to what you want to achieve. The only dirty hack I can think of is to download the source of the packages "matlab", "gtools", "pracma" and delete the offending functions from their NAMESPACE file prior to installation from source (with R CMD INSTALL package).
However, I would recommend using the explicit notation pracma::logit, because it improves readability of your code for other people and yourself in the future.
This site gives a good overview about package namespaces:
http://r-pkgs.had.co.nz/namespace.html

R Package development: overriding a function from one package with a function from another?

I am currently working on developing two packages, below is a simplified
version of my problem:
In package A I have some functions (say "sum_twice"), and I it calls to
another function inside the package (say "slow_sum").
However, in package B, I wrote another function (say "fast_sum"), with
which I wish to replace the slow function in package A.
Now, how do I manage this "overriding" of the "slow_sum" function with the
"fast_sum" function?
Here is a simplified example of such functions (just to illustrate):
############################
##############
# Functions in package A
slow_sum <- function(x) {
sum_x <- 0
for(i in seq_along(x)) sum_x <- sum_x + x[i]
sum_x
}
sum_twice <- function(x) {
x2 <- rep(x,2)
slow_sum(x2)
}
##############
# A function in package B
fast_sum <- function(x) { sum(x) }
############################
If I only do something like slow_sum <- fast_sum, this would not work, since "sum_twice" uses "slow_sum" from the NAMESPACE
of package A.
I tried using the following function when loading package "B":
assignInNamespace(x = "slow_sum", value = B:::fast_sum, ns = "A")
This indeed works, however, it makes the CRAN checks return both a NOTE on
how I should not use ":::", and also a warning for using assignInNamespace
(since it is supposed to not be very safe).
However, I am at a loss.
What would be a way to have "sum_twice" use "fast_sum" instead of
"slow_sum"?
Thank you upfront for any feedback or suggestion,
With regards,
Tal
p.s: this is a double post from here.
UDPATE: motivation for this question
I am developing two packages, one is based solely on R and works fine (but a bit slow), it is dendextend (which is now on CRAN). The other one is meant to speed up the first package by using Rcpp (this is dendextendRcpp which is on github). The second package speeds up the first by overriding some basic functions the first package uses. But in order for the higher levels functions in the first package will use the lower functions in the second package, I have to use assignInNamespace which leads CRAN to throw warnings+NOTES, which ended up having the package rejected from CRAN (until these warnings will be avoided).
The problem is that I have no idea how to approach this issue. The only solution I can think of is either mixing the two packages together (making it harder to maintain, and will automatically require a larger dependency structure for people asking to use the package). And the other option is to just copy paste the higher level functions from dendextend to dendextendRcpp, and thus have them mask the other functions. But I find this to be MUCH less elegant (because that means I will need to copy-paste MANY functions, forcing more double-code maintenance) . Any other ideas? Thanks.
We could put this in sum_twice:
my_sum_ch <- getOption("my_sum", if ("package:fastpkg" %in% search())
"fast_sum" else "slow_sum")
my_sum <- match.fun(my_sum_ch)
If the "my_sum" option were set then that version of my_sum would be used and if not it would make the decision based on whether or not fastpkg had been loaded.
The solution I ended up using (thanks to Uwe and Kurt), is using "local" to create a localized environment with the package options. If you're curious, the function is called "dendextend_options", and is here:
https://github.com/talgalili/dendextend/blob/master/R/zzz.r
Here is an example for its use:
dendextend_options <- local({
options <- list()
function(option, value) {
# ellipsis <- list(...)
if(missing(option)) return(options)
if(missing(value))
options[[option]]
else options[[option]] <<- value
}
})
dendextend_options("a")
dendextend_options("a", 1)
dendextend_options("a")
dendextend_options("a", NULL)
dendextend_options("a")
dendextend_options()

Save package settings between sessions

Is there a definitive way to save options or information pertaining to a certain package between sessions?
For example say somebody made a game and released it as an R package. If they wanted to save high scores and not have them reset each time R started a new session what would be the best way to do this? Currently I can only think of storing a file in the users home directory but I'm not sure if I like that approach.
This may be an approach. I created a dummy package with a dummy function (any function I create is bound to be a dummy function) and a data set I called scores that I set as follows:
scores <- NA
Then I created the package with the scores data set.
Then I used the following to change the data set from within R.
loc <- paste0(find.package("new"), "/Data")
unlink(paste0(loc, "/scores.rda"), recursive = TRUE, force = FALSE)
scores <- 10
save(scores, file=paste0(loc, "/scores.rda"))
Then when I unloaded the library and re loaded agin the data set now says:
> scores
[1] 10
Could this be modified to do what you want? You'd have to have it save in between somehow but am not sure on how to do this without messing with .Last function.
EDIT:
It appears this option is not viable in that when you compile as a package and use lazy load it saves the data sets as:
RData.rbd, RData.rbx, not as .rda files. That means the approach I use above is kinda worthless in that we want it to automatically be recognized.
EDIT2
This approach works and I tried it on a package I made. You can't do lazy load of the data and you have to either explicitly use data(scores) or use data(scores) inside of the function you're calling. I also assigned scores to .scores int he global.env the first time it was created and used exists inside the function to see if it exists. If `.scores. existed I assigned that to scores within the function. Once you unload the library and laod again you never have to worry about that again.
Maybe an alternative is to save this as a function somehow that can be altered using Josh's advice here: Permanently replacing a function
I guess there is no way to store settings without saving them to disk or a database, some way or another. It can be done silently though by putting the code below in your ~/.Rprofile. However, if you have packages that save settings in other ways than using options you need to add them manually.
I know this is exactly what you said you did not want, but it might spark some debate at least.
.Last <- function(){
my.options <- options()
save(my.options, file="~/.Roptions.Rdata")
}
.First <- function(){
tryCatch({
load("~/.Roptions.Rdata")
do.call(options, my.options)
rm(my.options)
}, error=function(...){})
}
To my suprise try(..., silent=TRUE) gives a warning on startup if ~/.Roptions.Rdata does not exist, which is why I used tryCatch instead.
The modern answer to this problem is well explained at https://blog.r-hub.io/2020/03/12/user-preferences/
I think I will be trying the hoardr package! Here is an example that worked for me :)
x <- hoardr::hoard()
x$cache_path_set("yourpackage", type = 'user_cache_dir')
x$mkdir()
scores<-data.frame(
user=c("one","two","three"),
score=c("500,200,1100")
)
save(scores,file = file.path(x$cache_path_get(), "scores.rdata"))
x$list()
x$details()
#new session
x <- hoardr::hoard()
x$cache_path_set("yourpackage", type = 'user_cache_dir')
x$list()
x$details()
load(file = file.path(x$cache_path_get(), "scores.rdata"))
PS - you can see a working example in the rnoaa package found on at github "opensci/rnoaa". Check their R/onload.r file! I can expand if needed.

Resources