Editing a function in R using trace? - r

I noticed there is a bug in a function from a package I want to use. An issue has been made on GitHub, but the creator hasn't adressed this yet, and I need the function as soon as possible.
Therefore I want to edit the code. Apparently this is possible by editing the source, repacking and installing the entire package, I can rewrite the function and reassign the namespace, but also possibly by just editing the function in the current session using trace().
I already found out I can do:
as.list(body(package:::function_inside_function))
The line I want to edit is located in the second step of the function.
Specifically, it is this line in the code I need to edit. I have to change ignore.case to ignore.case=TRUE. An example in case the link dies:
functionx(){if{...} else if(grepl("miRNA", data.type, ignore.case)) {...}}
I haven't really found a practical example on how to proceed from here, so can anyone show me an example of how to do this, or lead me to a practical example of using trace? Or perhaps reassigning the function to the namespace?

For your specific case, you could probably indeed work around it using trace.
From the link you provide I don't know why you speak of a function inside a function, but this should work:
# example
trace("grepl", tracer = quote(ignore.case <- TRUE))
grepl("hi", "Hi")
## Tracing grepl("hi", "Hi") on entry
## [1] TRUE
# your case (I assume)
trace("readTranscriptomeProfiling", tracer = quote(ignore.case <- TRUE))
Note that this would be more complicated if the argument ignore.case that you want to fix wasn't already at the right position in the call.

I faced a similar problem once and solved it using assignInNamespace(). I don't have your package installed so I cannot be sure this will work for you, but I think it should. You would proceed as follows:
Make the version of the function you want, as edited:
# I would just copy the function off github and change the offending line
readTranscripttomeProfiling <- function() {"Insert code here"}
# Get the problematic version of the function out of the package namespace
tmpfun <- get("readTranscripttomeProfiling",
envir = asNamespace("TCGAbiolinks"))
# Make sure the new function has the environment of the old
# function (there are possibly easier ways to do this -- I like
# to get the old function out of the namespace to be sure I can do
# it and am accessing what I want to access)
environment(readTranscripttomeProfiling) <- environment(tmpfun)
# Replace the old version of the function in the package namespace
# with your new version
assignInNamespace("readTranscripttomeProfiling",
readTranscripttomeProfiling, ns = "TCGAbiolinks")
I found this solution in another StackOverflow response, but cannot seem to find the original at the moment.

Related

R generic dispatching to attached environment

I have a bunch of functions and I'm trying to keep my workspace clean by defining them in an environment and attaching the environment. Some of the functions are S3 generics, and they don't seem to play well with this approach.
A minimum example of what I'm experiencing requires 4 files:
testfun.R
ttt.xxx <- function(object) print("x")
ttt <- function(object) UseMethod("ttt")
ttt2 <- function() {
yyy <- structure(1, class="xxx")
ttt(yyy)
}
In testfun.R I define an S3 generic ttt and a method ttt.xxx, I also define a function ttt2 calling the generic.
testenv.R
test_env <- new.env(parent=globalenv())
source("testfun.R", local=test_env)
attach(test_env)
In testenv.R I source testfun.R to an environment, which I attach.
test1.R
source("testfun.R")
ttt2()
xxx <- structure(1, class="xxx")
ttt(xxx)
test1.R sources testfun.R to the global environment. Both ttt2 and a direct function call work.
test2.R
source("testenv.R")
ttt2()
xxx <- structure(1, class="xxx")
ttt(xxx)
test2.R uses the "attach" approach. ttt2 still works (and prints "x" to the console), but the direct function call fails:
Error in UseMethod("ttt") :
no applicable method for 'ttt' applied to an object of class "xxx"
however, calling ttt and ttt.xxx without arguments show that they are known, ls(pos=2) shows they are on the search path, and sloop::s3_dispatch(ttt(xxx)) tells me it should work.
This questions is related to Confusion about UseMethod search mechanism and the link therein https://blog.thatbuthow.com/how-r-searches-and-finds-stuff/, but I cannot get my head around what is going on: why is it not working and how can I get this to work.
I've tried both R Studio and R in the shell.
UPDATE:
Based on the answers below I changed my testenv.R to:
test_env <- new.env(parent=globalenv())
source("testfun.R", local=test_env)
attach(test_env)
if (is.null(.__S3MethodsTable__.))
.__S3MethodsTable__. <- new.env(parent = baseenv())
for (func in grep(".", ls(envir = test_env), fixed = TRUE, value = TRUE))
.__S3MethodsTable__.[[func]] <- test_env[[func]]
rm(test_env, func)
... and this works (I am only using "." as an S3 dispatching separator).
It’s a little-known fact that you must use .S3method() to define methods for S3 generics inside custom environments (outside of packages).1 The reason almost nobody knows this is because it is not necessary in the global environment; but it is necessary everywhere else since R version 3.6.
There’s virtually no documentation of this change, just a technical blog post by Kurt Hornik about some of the background. Note that the blog post says the change was made in R 3.5.0; however, the actual effect you are observing — that S3 methods are no longer searched in attached environments — only started happening with R 3.6.0; before that, it was somehow not active yet.
… except just using .S3method will not fix your code, since your calling environment is the global environment. I do not understand the precise reason why this doesn’t work, and I suspect it’s due to a subtle bug in R’s S3 method lookup. In fact, using getS3method('ttt', 'xxx') does work, even though that should have the same behaviour as actual S3 method lookup.
I have found that the only way to make this work is to add the following to testenv.R:
if (is.null(.__S3MethodsTable__.)) {
.__S3MethodsTable__. <- new.env(parent = baseenv())
}
.__S3MethodsTable__.$ttt.xxx <- ttt.xxx
… in other words: supply .GlobalEnv manually with an S3 methods lookup table. Unfortunately this relies on an undocumented S3 implementation detail that might theoretically change in the future.
Alternatively, it “just works” if you use ‘box’ modules instead of source. That is, you can replace the entirety of your testenv.R by the following:
box::use(./testfun[...])
This code treats testfun.R as a local module and loads it, attaching all exported names (via the attach declaration [...]).
1 (and inside packages you need to use the equivalent S3method namespace declaration, though if you’re using ‘roxygen2’ then that’s taken care of for you)
First of all, my advice would be: don't try to reinvent R packages. They solve all the problems you say you are trying to solve, and others as well.
Secondly, I'll try to explain what went wrong in test2.R. It calls ttt on an xxx object, and ttt.xxx is on the search list, but is not found.
The problem is how the search for ttt.xxx happens. The search doesn't look for ttt.xxx in the search list, it looks for it in the environment from which ttt was called, then in an object called .__S3MethodsTable__.. I think there are two reasons for this:
First, it's a lot faster. It only needs to look in one or two places, and the table can be updated whenever a package is attached or detached, a relatively rare operation.
Second, it's more reliable. Each package has its own methods table, because two packages can use the same name for generics that have nothing to do with each other, or can use the same class names that are unrelated. So package code needs to be able to count on finding its own definitions first.
Since your call to ttt() happens at the top level, that's where R looks first for ttt.xxx(), but it's not there. Then it looks in the global .__S3MethodsTable__. (which is actually in the base environment), and it's not there either. So it fails.
There is a workaround that will make your code work. If you run
.__S3MethodsTable__. <- list2env(list(ttt.xxx = ttt.xxx))
as the last line of testenv.R, then you'll create a methods table in the global environment. (Normally there isn't one there, because that's user space, and R doesn't like putting things there unless the user asks for it.)
R will find that methods table, and will find the ttt.xxx method that it defines. I wouldn't be surprised if this breaks some other aspect of S3 dispatch, so I don't recommend doing it, but give it a try if you insist on reinventing the package system.

Removing/de-registering a specific function from an R package

I may not be using the terminology correctly here so please forgive me...
I have a case of one package 'overwriting' a function with the same name loaded by another package and thus changing the behavior (breaking) of a function.
The specific case:
X <- data.frame ( y = rnorm(100), x1 = rnorm(100), x2 = rnorm(100) )
library(CausalImpact)
a <- CausalImpact::CausalImpact( X, c(1,75), c(76, 100) ) # works
library(bfast) # imports quantmod which loads crappy version of as.zoo.data.frame
b <- CausalImpact::CausalImpact( X, c(1,75), c(76, 100) ) # Error
I know the error comes from two versions of the function as.zoo.data.frame.
The problematic version is imported by bfast from the package 'quantmod' (see https://github.com/joshuaulrich/quantmod/issues/168). Unfortunately their hotfix did not prevent this error. Super annoying.
I can hack around this specific problem, but I was wondering if there is a general way to like 'de-register' this function variant from the search path. Neither detach nor unloadNamespace remove the offending function (same behavior after). An explanation and similar problem is discussed here and here, but I wasn't able to find a general solution. For instance I'd rather just remove this function than clone and re-write CausalImpact to deal with this behavior.
From R 3.6.0 onwards, there is a new option called "conflicts.policy" to handle this within an established framework. For small issues like this, you can use the new arguments to library(). If you aren't yet to 3.6, the easiest solution might be to explicitly namespace CausalImpact when you need it, i.e. CausalImpact::CausalImpact. That's a mouthful, so you could do causal_impact <- CausalImpact::CausalImpact and use that alias.
# only attach select
library(dplyr, include.only = "select")
# exclude slice/arrange from being attached.
library(dplyr, exclude = c("slice", "arrange"))
library(bfast, exclude = "CausalImpact") should solve your problem.
Attach means that they are available for use without explicit prefixing with their package. In either of these cases, something like dplyr::slice would work just fine.
For more information, you can see ?library. Also, the R-Core member Luke Tierney wrote a blog explaining how the conflicts.policy works. You can find that here
Here's an answer that works, but is less preferable than de-registering a S3 method because it involves replacing the registered version in the S3 Methods table with the desired method:
library(CausalImpact)
library(bfast)
assignInNamespace("as.zoo.data.frame", zoo:::as.zoo.data.frame, ns = asNamespace("zoo"))
based partially on #smingerson's suggestion in the comments

Changing defaults in a function inside a locked package [duplicate]

This question already has answers here:
Setting Function Defaults R on a Project Specific Basis
(2 answers)
Closed 9 years ago.
I am developing my first package and it is aimed at users who are new to R, so I am trying to minimize the amount of R skills required to use the package. As a result I want a function that changes defaults in other functions within my package. But I get the following error "cannot add bindings to a locked environment", which means the environment of the package is locked and I am not allowed to change the default values of its functions.
Here is an example that throws a similar error:
library(ggplot2)
assign(formals(geom_point)$position, "somethingelse", pos="package:ggplot2")
When I try assignInNamespace i get:
Error in bindingIsLocked(x, ns) : no binding for "identity"
assignInNamespace(formals(geom_point)$position,"somethingelse", pos = "package:ggplot2")
Here is an example of what I hope to achieve.
default <- function(x=c("A", "B", "C")){
x
}
default()
change.default <- function(x){
formals(default)$x <<- x # Notice the global assign
}
change.default(1:3)
default()
I am aware that this is far from the recommended approach, but I am willing to cut corners to improve the learning curve of the package. Is there a way to achieve this?
This question has been marked as a duplicate of Setting Function Defaults R on a Project Specific Basis. This is a different situation as this question concerns how to allow the user in a interactive session to change the defaults of a function - not how to actually do it. The old question could not have been solved with the options() function and it is therefore a different question.
I think the colloquial way to achieve what you want is via option and packages in fact do so, e.g., lattice (although they use special options) or ascii.
Furthermore, this is also done so in base R, e.g., the famous and notorious default for stringsAsFactors.
If you look at ?read.table or ?data.frame you get: stringsAsFactors = default.stringsAsFactors(). Inspecting this reveals:
> default.stringsAsFactors
function ()
{
val <- getOption("stringsAsFactors")
if (is.null(val))
val <- TRUE
if (!is.logical(val) || is.na(val) || length(val) != 1L)
stop("options(\"stringsAsFactors\") not set to TRUE or FALSE")
val
}
<bytecode: 0x000000000b068478>
<environment: namespace:base>
The relevant part here is getOption("stringsAsFactors") which produces:
> getOption("stringsAsFactors")
[1] TRUE
Changing it is achieved like this:
> options(stringsAsFactors = FALSE)
> getOption("stringsAsFactors")
[1] FALSE
To do what you want your package would need to set an option, and the function take it's values form the options. Another function could then change the options:
options(foo=c("A", "B", "C"))
default <- function(x=getOption("foo")){
x
}
default()
change.default <- function(x){
options(foo=x)
}
change.default(1:3)
default()
If you want your package to set the options when loaded, you need to create a .onAttach or .onLoad function in zzz.R. My afex package e.g., does this and changes the default contrasts. In your case it could look like the following:
.onAttach <- function(libname, pkgname) {
options(foo=c("A", "B", "C"))
}
ascii does it via .onLoad (I don't remember what is the exact difference, but Writing R Extensions will help).
Preferably, a function has the following things:
Input arguments
A function body which does something with those arguments
Output arguments
So in your situation where you want to change something about the behavior of a function, changing the input arguments in the best way to go. See for example my answer to another post.
You could also use an option to save some global settings (e.g. which font to use, which PATH the packages you use are stored), see the answer of #James in the question I linked above. But use these things sparingly as it makes the code hard to read. I would primarily use them read only, i.e. set them once (either by the package or the user) and not allow functions to change them.
The unreadability stems from the fact that the behavior of the function is not solely determined locally (i.e. by the code directly working with it), but also by settings far away. This makes it hard to determine what a function does by purely looking at the code calling it, but you have to dig through much more code to fully understand what is going on. In addition, what if other functions change those options, making it even harder to predict what a given function will do as it depends on the history of functions. And here comes my earlier recommendation for read-only options back into play, if these are read only, some of the problems about readability are lessened.

Access (exporting) function from namespace

I know something similar has been asked before here on SO, but the solution given there doesn't seem to apply in my case.
I'm trying to follow convention in creating a package by referring to functions exported from other namespaces and avoiding use of require() within a function.
I'm basically trying to prevent a function taking too long to run. For example,
fun <- function(i){
require(R.utils)
setTimeLimit(elapsed=10, transient=TRUE) # prevent taking more than 10secs
return(i^i)
}
>fun(10)
Works fine, but if I try:
require(R.utils)
fun <- function(i){
R.utils:::setTimeLimit(elapsed=10, transient=TRUE) # prevent taking more than 10secs
return(i^i)
}
>fun(10)
I get:
Error in get(name, envir = asNamespace(pkg), inherits = FALSE) :
object 'setTimeLimit' not found
Changing ::: to :: doesn't change this behavior.
I'm open to any simpler methods to achieving the same objective.
Also is it really so bad to have require() calls inside a function?
Many thanks!
EDIT:
If import works then great, thanks. Still in development so wanted to make sure it would be OK.
EDIT:
Apologies, it's there in base. Not sure how I missed this; I was originally using R.utils::evalWithTimeout and must have assumed both were in the same package. *looks sheepish*
I'm just posting this to prevent the question from showing up as unanswered, but will be glad to accept another...
isTRUE("setTimeLimit" %in% ls(getNamespace("base"), all.names=TRUE))

Save package settings between sessions

Is there a definitive way to save options or information pertaining to a certain package between sessions?
For example say somebody made a game and released it as an R package. If they wanted to save high scores and not have them reset each time R started a new session what would be the best way to do this? Currently I can only think of storing a file in the users home directory but I'm not sure if I like that approach.
This may be an approach. I created a dummy package with a dummy function (any function I create is bound to be a dummy function) and a data set I called scores that I set as follows:
scores <- NA
Then I created the package with the scores data set.
Then I used the following to change the data set from within R.
loc <- paste0(find.package("new"), "/Data")
unlink(paste0(loc, "/scores.rda"), recursive = TRUE, force = FALSE)
scores <- 10
save(scores, file=paste0(loc, "/scores.rda"))
Then when I unloaded the library and re loaded agin the data set now says:
> scores
[1] 10
Could this be modified to do what you want? You'd have to have it save in between somehow but am not sure on how to do this without messing with .Last function.
EDIT:
It appears this option is not viable in that when you compile as a package and use lazy load it saves the data sets as:
RData.rbd, RData.rbx, not as .rda files. That means the approach I use above is kinda worthless in that we want it to automatically be recognized.
EDIT2
This approach works and I tried it on a package I made. You can't do lazy load of the data and you have to either explicitly use data(scores) or use data(scores) inside of the function you're calling. I also assigned scores to .scores int he global.env the first time it was created and used exists inside the function to see if it exists. If `.scores. existed I assigned that to scores within the function. Once you unload the library and laod again you never have to worry about that again.
Maybe an alternative is to save this as a function somehow that can be altered using Josh's advice here: Permanently replacing a function
I guess there is no way to store settings without saving them to disk or a database, some way or another. It can be done silently though by putting the code below in your ~/.Rprofile. However, if you have packages that save settings in other ways than using options you need to add them manually.
I know this is exactly what you said you did not want, but it might spark some debate at least.
.Last <- function(){
my.options <- options()
save(my.options, file="~/.Roptions.Rdata")
}
.First <- function(){
tryCatch({
load("~/.Roptions.Rdata")
do.call(options, my.options)
rm(my.options)
}, error=function(...){})
}
To my suprise try(..., silent=TRUE) gives a warning on startup if ~/.Roptions.Rdata does not exist, which is why I used tryCatch instead.
The modern answer to this problem is well explained at https://blog.r-hub.io/2020/03/12/user-preferences/
I think I will be trying the hoardr package! Here is an example that worked for me :)
x <- hoardr::hoard()
x$cache_path_set("yourpackage", type = 'user_cache_dir')
x$mkdir()
scores<-data.frame(
user=c("one","two","three"),
score=c("500,200,1100")
)
save(scores,file = file.path(x$cache_path_get(), "scores.rdata"))
x$list()
x$details()
#new session
x <- hoardr::hoard()
x$cache_path_set("yourpackage", type = 'user_cache_dir')
x$list()
x$details()
load(file = file.path(x$cache_path_get(), "scores.rdata"))
PS - you can see a working example in the rnoaa package found on at github "opensci/rnoaa". Check their R/onload.r file! I can expand if needed.

Resources