modify the body text of existing function objects - r

I have some .Rdata files that contain saved functions as defined by approxfun().
Some of the save files pre-date the change to approxfun from package "base" to "stats", and so the body has
PACKAGE = "base"
and the wrong package causes the function to fail. I can fix(myfun) and simply replace "base" with "stats", but I want a neater automatic way.
Can I do this with gsub() and body() somehow?
I can get the body text and substitute there with
as.character(body(myfun))
but I don't know how to turn that back into a "call" and replace the definition.
(I know that a better solution is to have saved the data originally used by approxfun and simply recreate the function, but I wonder if there's a sensible way to modify the existing one.)
Edit: I found it here: "What ways are there to edit a function in R?"

Use the substitute function.
For example:
myfun <- function(x,y) {
result <- list(x+y,x*y)
return(result)
}
Using body, treat myfun as a list to select what you would like to change in the function:
> body(myfun)[[2]][[3]][[2]]
x + y
When you change this, you must use the substitute function so you replace the part of the function with a call or name object, as appropriate. Replacing with character strings doesn't work since functions are not stored as or operated on as character strings.
body(myfun)[[2]][[3]][[2]] <- substitute(2*x)
Now the selected piece of the function has been replaced:
> myfun
function (x, y)
{
result <- list(2 * x, x * y)
return(result)
}
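For the original question (swapping a string inside the body with gsub()), one possible route is to deparse the body, edit the text, and parse it back into a call. This is only a sketch: myfun below is a hypothetical stand-in rather than a real approxfun() result, and it assumes the text PACKAGE = "base" appears literally in the deparsed body.

```r
# Hypothetical stand-in for a saved function whose body mentions PACKAGE = "base"
myfun <- function(v) {
  settings <- list(PACKAGE = "base")
  paste(settings$PACKAGE, v)
}

# 1. Turn the body into text (one element per line of source)
body_text <- deparse(body(myfun))

# 2. Edit the text; fixed = TRUE treats the pattern as a literal string
body_text <- gsub('PACKAGE = "base"', 'PACKAGE = "stats"', body_text, fixed = TRUE)

# 3. Parse the text back into a language object and install it as the new body
body(myfun) <- parse(text = body_text)[[1]]

myfun("ok")
```

Assigning through body<- leaves the function's formals and environment alone, which matters when the function is a closure that keeps its data in that environment. deparse() may reformat the code, so it is worth comparing the result against the original before overwriting any saved files.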

Related

Passing values between functions in an R package

In an R package, let's say we have two functions. One sets some parameters; the other uses those parameters. How can I build such a pattern in R? It is similar to event-driven applications, but I am not sure whether it is possible in R.
For example:
If we run set_param(a=10), then whenever we run print_a.R, it prints 10, and after running set_param(a=20), it prints 20.
I need a solution without assigning value to the global environment because CRAN checks raise notes.
I suggest adding a variable to your package, as @MrFlick suggested.
For instance, in ./R/myoptions.R:
.myoptions <- new.env(parent = emptyenv())
getter <- function(k) {
.myoptions[[k]]
}
setter <- function(k, v) {
.myoptions[[k]] <- v
}
lister <- function() {
names(.myoptions)
}
Then other package functions can use this as a key/value store:
getter("optA")
# NULL
setter("optA", 99)
getter("optA")
# [1] 99
lister()
# [1] "optA"
and all the while, nothing is in the .GlobalEnv:
ls(all.names = TRUE)
# character(0)
Values can be as complex as you want.
Note that these are not exported, so if you want/need the user to have direct access to this, then you'll need to update NAMESPACE or, if using roxygen2, add #' @export before each function definition.
NB: I should add that a more canonical approach might be to use options(.) for these, so that users can preemptively control and access them programmatically.
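A rough sketch of the options() route follows. The option name mypkg.optA and the helper get_optA are made up for illustration, and the default of 10 is arbitrary.

```r
# "mypkg.optA" is a hypothetical option name; prefixing it with the package
# name is a common way to avoid clashing with other packages' options
get_optA <- function() {
  getOption("mypkg.optA", default = 10)
}

before <- get_optA()              # falls back to the default while unset
old <- options(mypkg.optA = 20)   # options() returns the previous values
during <- get_optA()
options(old)                      # restore whatever was there before
after <- get_optA()
```

Because the value lives in the session's options rather than in the package, users can set it before the package is even loaded.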

Unit testing functions with global variables in R

Preamble: package structure
I have an R package that contains an R/globals.R file with the following content (simplified):
utils::globalVariables("COUNTS")
Then I have a function that simply uses this variable. For example, R/addx.R contains a function that adds a number to COUNTS
addx <- function(x) {
COUNTS + x
}
This is all fine when running devtools::check() on my package: there are no complaints about COUNTS being out of the scope of addx().
Problem: writing a unit test
However, say I also have a tests/testthat/test-addx.R file with the following content:
test_that("addition works", expect_gte(addx(1), 1))
The content of the test doesn't really matter here, because when running devtools::test() I get an "object 'COUNTS' not found" error.
What am I missing? How can I correctly write this test (or set up my package)?
What I've tried to solve the problem
Adding utils::globalVariables("COUNTS") to R/addx.R, either before, inside or after the function definition.
Adding utils::globalVariables("COUNTS") to tests/testthat/test-addx.R in all places I could think of.
Manually initializing COUNTS (e.g., with COUNTS <- 0 or <<- 0) in all places of tests/testthat/test-addx.R I could think of.
Reading some examples from other packages on GitHub that use a similar syntax (source).
I think you misunderstand what utils::globalVariables("COUNTS") does. It just declares that COUNTS is a global variable, so when the code analysis sees
addx <- function(x) {
COUNTS + x
}
it won't complain about the use of an undefined variable. However, it is up to you to actually create the variable, for example by an explicit
COUNTS <- 0
somewhere in your source. I think if you do that, you won't even need the utils::globalVariables("COUNTS") call, because the code analysis will see the global definition.
Where you would need it is when you're doing some nonstandard evaluation, so that it's not obvious where a variable comes from. Then you declare it as a global, and the code analysis won't worry about it. For example, you might get a warning about
subset(df, Col1 < 0)
because it appears to use a global variable named Col1, but of course that's fine, because the subset() function evaluates in a non-standard way, letting you include column names without writing df$Col1.
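The point that the variable has to exist somewhere the function can see it can be illustrated outside a package. In this sketch, pkg_env is a hypothetical stand-in for the package namespace; it is not how a real package is built.

```r
# pkg_env plays the role of the package namespace in this illustration
pkg_env <- new.env()

local({
  COUNTS <- 0                      # the variable is actually created here
  addx <- function(x) COUNTS + x   # closure whose enclosing env is pkg_env
}, envir = pkg_env)

result <- pkg_env$addx(1)
```

Here addx() finds COUNTS because it was defined in the function's enclosing environment; declaring the name with utils::globalVariables() alone would not have created it.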
@user2554330's answer is great for many things.
If I understand correctly, you have a COUNTS that needs to be updateable, so putting it in the package environment might be an issue.
One technique you can use is the use of local environments.
Two alternatives:
If it will always be referenced in one function, it might be easiest to change the function from
myfunc <- function(...) {
# do something
COUNTS <- COUNTS + 1
}
to
myfunc <- local({
COUNTS <- 0  # start from a number; NA + 1 would stay NA
function(...) {
# do something
COUNTS <<- COUNTS + 1
}
})
What this does is create a local environment "around" myfunc, so when it looks for COUNTS, it will be found immediately. Note that it reassigns using <<- instead of <-, since the latter would not update the different-environment-version of the variable.
You can actually access this COUNTS from another function in the package:
otherfunc <- function(...) {
COUNTScopy <- get("COUNTS", envir = environment(myfunc))
COUNTScopy <- COUNTScopy + 1
assign("COUNTS", COUNTScopy, envir = environment(myfunc))
}
(Feel free to name it COUNTS here as well, I used a different name to highlight that it doesn't matter.)
While the use of get and assign is a little inconvenient, it should only be required twice per function that needs to do this.
Note that the user can get to this if needed, but they'll need to use similar mechanisms. Perhaps that's a problem; in my packages where I need some form of persistence like this, I have used convenience getter/setter functions.
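As a sketch of the convenience getter/setter idea, the names get_counts and set_counts below are hypothetical, and the counter is assumed to start at 0.

```r
myfunc <- local({
  COUNTS <- 0                    # lives in the environment wrapped around myfunc
  function(...) {
    COUNTS <<- COUNTS + 1        # <<- updates the enclosing binding
    invisible(COUNTS)
  }
})

# Hypothetical accessors that hide the get()/assign() calls
get_counts <- function() get("COUNTS", envir = environment(myfunc))
set_counts <- function(value) assign("COUNTS", value, envir = environment(myfunc))

myfunc()
myfunc()
first <- get_counts()    # value after two calls
set_counts(0)
second <- get_counts()   # value after the reset
```

With accessors like these, other package functions (or users, if they are exported) never need to know where COUNTS is stored.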
You can place an environment within your package, and then use it like a named list within your package functions:
E <- new.env(parent = emptyenv())
E$COUNTS <- 0  # initialize, otherwise E$COUNTS is NULL
myfunc <- function(...) {
# do something
E$COUNTS <- E$COUNTS + 1
}
otherfunc <- function(...) {
E$COUNTS <- E$COUNTS + 1
}
We do not need the get/assign pair of functions, since E (a horrible name, chosen for its brevity) should be visible to all functions in your package. If you don't need the user to have access, then keep it unexported. If you want users to be able to access it, then exporting it via the normal package mechanisms should work.
Note that with both of these, if the user unloads and reloads the package, the COUNTS value will be lost/reset.
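A small runnable version of the shared-environment alternative, assuming the counter starts at 0 (without an initial value, E$COUNTS is NULL and NULL + 1 gives numeric(0)). The increment of 10 in otherfunc is arbitrary and only there to tell the two functions apart.

```r
E <- new.env(parent = emptyenv())
E$COUNTS <- 0                     # assumed starting value

myfunc <- function(...) {
  E$COUNTS <- E$COUNTS + 1        # environments are modified in place
  invisible(E$COUNTS)
}
otherfunc <- function(...) {
  E$COUNTS <- E$COUNTS + 10
  invisible(E$COUNTS)
}

myfunc()
otherfunc()
total <- E$COUNTS
```

Both functions see the same E, so the updates accumulate without any get()/assign() calls.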
I'll provide a third option, in case the user wants/needs direct access, or you don't want to do this type of value management within your package.
Make the user provide it at all times. For this, add an argument to every function that needs it, and have the user pass an environment. I recommend an environment because most arguments are passed by value, whereas environments have reference semantics, so updates made inside a function are visible to the caller.
For instance, in your package:
myfunc <- function(..., countenv) {
stopifnot(is.environment(countenv))
# do something
countenv$COUNT <- countenv$COUNT + 1
}
otherfunc <- function(..., countenv) {
countenv$COUNT <- countenv$COUNT + 1
}
new_countenv <- function(init = 0) {
E <- new.env(parent = emptyenv())
E$COUNT <- init
E
}
where new_countenv is really just a convenience function.
The user would then use your package as:
mycount <- new_countenv()
myfunc(..., countenv = mycount)
otherfunc(..., countenv = mycount)
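In the usage lines above, the ... stands for whatever other arguments the user passes. A self-contained run with no extra arguments might look like this sketch, which repeats the definitions so that it can be run on its own:

```r
new_countenv <- function(init = 0) {
  E <- new.env(parent = emptyenv())
  E$COUNT <- init
  E
}

myfunc <- function(..., countenv) {
  stopifnot(is.environment(countenv))
  countenv$COUNT <- countenv$COUNT + 1   # modifies the caller's environment
  invisible(countenv$COUNT)
}

mycount <- new_countenv()
myfunc(countenv = mycount)
myfunc(countenv = mycount)
final <- mycount$COUNT

# A second, independent counter is unaffected by the first
othercount <- new_countenv(init = 100)
```

Since each counter is just an environment owned by the user, several independent counters can coexist in one session.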

Referring to package and function as arguments in another function

I am trying to find methods for specific functions across different packages in R. For example methods(broom::tidy) will return all methods for the function tidy in the package broom. For my current issue it would be better if I could have the methods function in another function like so:
f1 <- function(x,y){
methods(x::y)
}
(I removed other parts of the code that are not relevant to my issue.)
However when I run the function like this:
f1 <- function(x,y){ methods(x::y)}
f1(broom,tidy)
I get the error
Error in loadNamespace(name) : there is no package called ‘x’
If I try to modify it as to only change the function but keep the package the same I get a similar error :
f2 <- function(y){ methods(broom::y)}
f2(tidy)
Error: 'y' is not an exported object from 'namespace:broom'
How can I get the package and function name to evaluate properly in the function? Does this current issue have to do with when R is trying to evaluate/substitute values in the function?
Both the :: and methods() functions use non-standard evaluation in order to work. This means you need to be a bit more clever with passing values to the functions in order to get it to work. Here's one method:
f1 <- function(x,y){
do.call("methods", list(substitute(x::y)))
}
f1(broom,tidy)
Here we use substitute() to expand the x and y values we pass in into the namespace lookup. That solves the :: part, which you can see with
f2 <- function(x,y){
substitute(x::y)
}
f2(broom,tidy)
# broom::tidy
We need the substitute because there could very well be a package x with function y. For this reason, variables are not expanded when using ::. Note that :: is just a wrapper to getExportedValue() should you otherwise need to extract values from namespaces using character values.
But there is one other catch: methods() doesn't evaluate its parameters; it uses the raw expression to find the methods. This means we don't actually need the value of broom::tidy; we need to pass that literal expression. Since we need to evaluate the substitute to get the expression we need, we need to build the call with do.call() in order to evaluate the substitute and pass that expression on to methods().
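To see the two pieces separately, here is a small sketch using the stats package (chosen only because it ships with R; broom may not be installed). It shows the expression that substitute() builds and the character-based lookup through getExportedValue().

```r
# substitute() swaps in the expressions the caller supplied for x and y
f2 <- function(x, y) {
  substitute(x::y)
}

expr <- f2(stats, median)
expr_text <- deparse(expr)          # the unevaluated call, as text

# Evaluating the call performs the namespace lookup
via_call <- eval(expr)

# The same lookup using character strings instead of bare names
via_strings <- getExportedValue("stats", "median")

same_object <- identical(via_call, via_strings)
```

If the package and function names are already available as strings, the getExportedValue() form avoids non-standard evaluation altogether.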

How do you re-write the rm() function in R to clear your workspace automatically [duplicate]

I am trying to find a way to clear the workspace in R using lists.
According to the documentation, I could simply create a vector with all my workspace objects: WS=c(ls()). But nothing happens when I try element-wise deletion with rm(c(ls())) or rm(WS).
I know I can use the command rm(list=ls()). I am just trying to figure how R works. Where did I err in my thinking in applying the rm() function on a vector with the list of the objects?
Specifically, I'm trying to create a function similar to the clc function in MATLAB, but I am having trouble getting it to work. Here's the function that I've written:
clc <- function() { rm(list = ls()) }
From ?rm, "Details" section:
Earlier versions of R incorrectly claimed that supplying a character vector in ... removed the objects named in the character vector, but it removed the character vector. Use the list argument to specify objects via a character vector.
Your attempt should have been:
rm(list = WS)
HOWEVER, this will still leave you with an object (a character vector) named "WS" in your workspace since that was created after you called WS <- c(ls()). To actually get rid of the "WS" object, you would have had to use rm(WS, list = WS). :-)
How does it work? If you look at the code for rm, the first few lines of the function capture any individual objects that have been specified, whether quoted or unquoted. Towards the end of the function, you will find the line list <- .Primitive("c")(list, names), which basically creates a character vector of all of the objects individually named and any objects in the character vector supplied to the "list" argument.
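The difference between rm(list = WS) and rm(WS, list = WS) can be tried safely in a throwaway environment instead of the real workspace. In this sketch, scratch is a hypothetical environment standing in for the workspace.

```r
scratch <- new.env()
scratch$a <- 1
scratch$b <- 2
scratch$WS <- ls(scratch)        # evaluated before WS exists: c("a", "b")

# Removes the objects named in WS, but WS itself survives
rm(list = scratch$WS, envir = scratch)
after_first <- ls(scratch)

# Recreate the objects, then name WS directly as well
scratch$a <- 1
scratch$b <- 2
rm(WS, list = scratch$WS, envir = scratch)
after_second <- ls(scratch)
```

In the second call, the bare name WS is captured from ... and appended to the names given in list, so everything is removed in one go.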
Update
Based on your comment, it sounds like you're trying to write a function like:
.clc <- function() {
rm(list = ls(.GlobalEnv), envir = .GlobalEnv)
}
I think it's a little bit of a dangerous function, but let's test it out:
ls()
# character(0)
for (i in 1:5) assign(letters[i], i)
ls()
# [1] "a" "b" "c" "d" "e" "i"
.clc()
ls()
# character(0)
Note: FYI, I've named the function .clc (with a dot) so that it doesn't get removed when the function is run. If you wanted to write a version of the function without the ., you would probably do better to put the function in a package and load that at startup to have the function available.
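It may also help to see why the clc() from the question appears to do nothing: inside a function, ls() and rm() default to that function's own frame. The sketch below uses a throwaway environment called scratch, and clear_env is a hypothetical variant of .clc() that takes the environment as an argument so that nothing in the real workspace is touched.

```r
scratch <- new.env()
scratch$a <- 1
scratch$b <- 2

# ls() here lists the function's own (empty) frame, so nothing is removed
clc_broken <- function() {
  rm(list = ls())
  invisible(NULL)
}
clc_broken()
untouched <- ls(scratch)

# Naming the environment explicitly for both ls() and rm() does the job
clear_env <- function(env) {
  rm(list = ls(env), envir = env)
  invisible(NULL)
}
clear_env(scratch)
emptied <- ls(scratch)
```

Passing .GlobalEnv as env would reproduce the behaviour of .clc() above, with the same caveat about it being a dangerous function.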
