R CMD check NOTE: `No visible binding for global variable`, when defining a variable in an S3 generic and using it in a method - r

Consider the following piece of code:
greet <- function(object) {
greeting <- "hola"
UseMethod("greet", object)
}
greet.character <- function(object)
paste(greeting, object)
greet("stackoverflow")
#> [1] "hola stackoverflow"
Created on 2021-02-03 by the reprex package (v0.3.0)
Evidently, the execution environment of greet.character method is populated with the variables created in the generic, before UseMethod(), so that the code works as expected.
However, including these definitions in a package will lead to the following NOTE upon R CMD check:
> checking R code for possible problems ... NOTE
greet.character: no visible binding for global variable ‘greeting’
Undefined global functions or variables:
greeting
You can find an example package illustrating the issue at this repository.
Now, I am aware that I can shut up the NOTE with globalVariables(), but I am still puzzled about the origin of the NOTE in the first place. So my questions are:
Is there any intrinsic reason why the example above should be considered "bad practice" in an R package? In a real life example, I'm using a construct like this to avoid code repetition in several methods.
If not, is there any better option than using globalVariables()?

Answering myself, I think this practice can lead to confusing behaviour since UseMethod() generates a call of the appropriate method with its same arguments; implying that modifications to a variable which is an argument of the generic (not like greeting in my example above) are not propagated to the method.
Right now I'm defining extra functions to be called at the beginning of each method to perform the complex computation I was originally doing in the generic. This can also lead to cumbersome code if these functions require many arguments, but is the best solution I have found so far.
I will leave this open for some time to see if someone has other approaches to propose.

Related

Is it unwise to modify the class of functions in other packages?

There's a bit of a preamble before I get to my question, so hang with me!
For an R package I'm working on I'd like to make it as easy as possible for users to partially apply functions inline. I had the idea of using the [] operators to call my partial application function, which I've name "partialApplication." What I aim to achieve is this:
dnorm[mean = 3](1:10)
# Which would be exactly equivalent to:
dnorm(1:10, mean = 3)
To achieve this I tried defining a new [] method for objects of class function, i.e.
`[.function` <- function(...) partialApplication(...)
However, R gives a warning that the [] method for function objects is "locked." (Is there any way to override this?)
My idea seemed to be thwarted, but I thought of one simple solution: I can invent a new S3 class "partialAppliable" and create a [] method for it, i.e.
`[.partialAppliable` = function(...) partialApplication(...)
Then, I can take any function I want and append 'partialAppliable' to its class, and now my method will work.
class(dnorm) = append(class(dnorm), 'partialAppliable')
dnorm[mean = 3](1:10)
# It works!
Now here's my question/problem: I'd like users to be able to use any function they want, so I thought, what if I loop through all the objects in the active environment (using ls) and append 'partialAppliable' to the class of all functions? For instance:
allobjs = unlist(lapply(search(), ls))
#This lists all objects defined in all active packages
for(i in allobjs) {
if(is.function(get(i))) {
curfunc = get(i)
class(curfunc) = append(class(curfunc), 'partialAppliable')
assign(i, curfunc)
}
}
Voilà! It works. (I know, I should probably assign the modified functions back into their original package environments, but you get the picture).
Now, I'm not a professional programmer, but I've picked up that doing this sort of thing (globally modifying all variables in all packages) is generally considered unwise/risky. However, I can't think of any specific problems that will arise. So here's my question: what problems might arise from doing this? Can anyone think of specific functions/packages that will be broken by doing this?
Thanks!
This is similar to what the Defaults package did. The package is archived because the author decided that modifying other package's code is a "very bad thing". I think most people would agree. Just because you can do it, does not mean it's a good idea.
And, no, you most certainly should not assign the modified functions back to their original package environments. CRAN does not like when packages modify the users search path unnecessarily, so I would be surprised if they allowed a package to modify other package's function arguments.
You could work around that by putting all the modified functions in an environment on the search path. But then you have to ensure that environment is always searched first, which means modifying the search path every time another package is loaded.
Changing arguments for functions in other packages also has the potential to make it very difficult for others to reproduce your results because they must have all your argument settings. Unless you always call functions with all their arguments specified, which defeats the purpose of what you're trying to do.

R: How do I add an extra function to a package?

I would like to add an idiosyncratically modified function to package written by someone else, with an R Script, i.e. just for the session, not permanently. The specific example is, let's say, bls_map_county2() added to the blscrapeR package. bls_map_county2 is just a copy of the bls_map_county() function with an added ... argument, for purposes of changing a few of the map drawing parameters. I have not yet inserted the additional parameters. Running the function as-is, I get the error:
Error in BLS_map_county(map_data = df, fill_rate = "unemployed_rate", :
could not find function "geom_map"
I assume this is because my function does not point to the blscrapeR namespace. How do I assign my function to the (installed, loaded) blscrapeR namespace, and is there anything else I need to do to let it access whatever machinery from the package it requires?
When I am hacking on a function in a particular package that in turn calls other functions I often use this form after the definition:
mod_func <- function( args) {body hacked}
environment(mod_func) <- environment(old_func)
But I think the function you might really want is assignInNamespace. These methods will allow the access to non-exported functions in loaded packages. They will not however succeed if the package is not loaded. So you may want to have a stopifnot() check surrounding require(pkgname).
There are two parts to this answer - first a generic answer to your question, and 2nd a specific answer for the particular function that you reference, in which the problem is something slightly different.
1) generic solution to accessing internal functions when you edit a package function
You should already have access to the package namespace, since you loaded it, so it is only the unexported functions that will give you issues.
I usually just prepend the package name with the ::: operator to the non exported functions. I.e., find every instance of a call to some_internal_function(), and replace it with PackageName:::some_internal_function(). If there are several different internal functions called within the function you are editing, you may need to do this a few times for each of the offending function calls.
The help page for ::: does contain these warnings
Beware -- use ':::' at your own risk!
and
It is typically a design mistake to use ::: in your code since the
corresponding object has probably been kept internal for a good
reason. Consider contacting the package maintainer if you feel the
need to access the object for anything but mere inspection.
But for what you are doing, in terms of temporarily hacking another function from the same package for your own use, these warnings should be safe to ignore (at you own risk, of course - as it says in the manual)
2) In the case of blscrapeR ::bls_map_county()
The offending line in this case is
ggplot2::ggplot() + geom_map(...
in which the package writers have specified the ggplot2 namespace for ggplot(), but forgotten to do so for geom_map() which is also part of ggplot2 (and not an internal function in blscrapeR ).
In this case, just load ggplot2, and you should be good to go.
You may also consider contacting the package maintainer to inform them of this error.

plot.predict() or predict.plot() functions cited in literature, but no longer exists

I am trying to use predict.plot or plot.predict (I've seen both of them referenced on various websites and I don't know which one is right).
However, in R, neither of these are valid functions. I don't know if I'm missing a package or if the sources I'm using are outdated in terms of the functions being referenced.
This site is using the function
It's very old. Can someone familiar with stats help me figure out how to do this with the latest version of R, or help me figure out how to get predict.plot/plot.predict to work?
This is a user defined function. The author of the course you linked to provides it online:
Link to R script
The S3 dispatch system for functions in R examines the class of the first argument and then calls a function with the name func.class. In this case the author has defined several plot.predict functions: predict.plot.data.frame, predict.plot.lm, and predict.plot.formula which are then given the arguments on hte basis of the class of what is (first) in the argument list. The plot.predict function is just this:
predict.plot <- function(object, ...) UseMethod("predict.plot")
The "good stuff" is in the other three functions at the link Roland provided. I do think the author's use of a dot in the name for the generic function is a bit confusing. One might have expected there to be a class "plot" for which there was a generic function predict, but that is not really the case here, although you might find it interesting to just type: methods(predict) .After you had loaded the R script on that website you could find those various functions using:
methods(predict.plot)
#[1] predict.plot.data.frame predict.plot.formula predict.plot.lm

Building R package: no visible global function definition for 'subject'

I'm building an R package for the first time and am having some trouble. I am doing an R CMD Check and am getting the following error:
get.AlignedPositions: no visible global function definition for 'subject'
I am not sure what is causing this. I don't even have a "subject" variable in my code. The code is rather lengthy so I rather not paste all of it unless someone asks in a comment. Is there something specific I should look for? The only thing I can think of is that I have a line like this:
alignment <-pairwiseAlignment(pattern = canonical.protein, subject=protein.extracted, patternQuality=patternQuality,
subjectQuality=subjectQuality,type = type, substitutionMatrix= substitutionMatrix,
fuzzyMatrix=fuzzyMatrix,gapOpening=gapOpening,gapExtension=gapExtension,
scoreOnly=scoreOnly)
but subject is defined by the pairwiseAlignment function in the Biostrings package. Thank you for your help!
R spotted a function, subject, being used without a function called subject being available. One possible reason for this is explained in this discussion on R-devel. In that case code is used conditionally, e.g. if a certain package is installed we use its functionality. When checking the package on a system which does not have this package installed, we run in to these kinds of warnings. So please check if this might be the case. Alternatively, you might have made a mistake by calling subject while no function existed, e.g. subject was not a function but just an object.

Can we have more error (messages)?

Is there a way, in R, to pop up an error message if a function uses a variable
not declared in the body of the function: i.e, i want someone to flag this type of functions
aha<-function(p){
return(p+n)
}
see; if there happens to be a "n" variable lying somewhere, aha(p=2) will give me an "answer" since R will just take "n" from that mysterious place called the "environment"
If you want to detect such potential problems during the code-writing phase and not during run-time, then the codetools package is your friend.
library(codetools)
aha<-function(p){
return(p+n)
}
#check a specific function:
checkUsage(aha)
#check all loaded functions:
checkUsageEnv(.GlobalEnv)
These will tell you that no visible binding for global variable ‘n’.
Richie's suggestion is very good.
I would just add that you should consider creating unit test cases that would run in a clean R environment. That will also eliminate the concern about global variables and ensures that your functions behave the way that they should. You might want to consider using RUnit for this. I have my test suite scheduled to run every night in a new environment using RScript, and that's very effective and catching any kind of scope issues, etc.
Writing R codes to check other R code is going to be tricky. You'd have to find a way to determine which bits of code were variable declarations, and then try and work out whether they'd already been declared within the function.
EDIT: The previous statement would have been true, but as Aniko pointed out, the hard work has already been done in the codetools package.
One related thing that may be useful to you is to force a variable to be taken from the function itself (rather than from an enclosing environment).
This modified version of your function will always fail, since n is not declared.
aha <- function(p)
{
n <- get("n", inherits=FALSE)
return(p+n)
}

Resources