How to use utils::globalVariables

Following your recommendations (or at least trying to), I have tried some options, but the problem remains, so there must be something I am missing.
I have included more complete code below.
setwd("C:/naapp")
#' @import utils
#' @import devtools
I have tried with and without using suppressForeignCheck
if (getRversion() >= "2.15.1") {
  utils::globalVariables(c("eleven"))
  utils::suppressForeignCheck(c("eleven"))
}

myFunctionSum <- function(X) { print(X + eleven) }
myFunctionMul <- function(X) { print(X * eleven) }

myFunction11 <- function(X) {
  assign("eleven", 11, envir = environment(myFunctionMul))
}
Maybe I should use a particular environment?
package.skeleton(name = "myPack11", list = ls(),
                 path = "C:/naapp", force = TRUE,
                 code_files = character())
I remove the "man" directory from myPack11, otherwise I would get an error because the help files are empty. I add the imports utils and devtools to the DESCRIPTION.
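For reference, the relevant DESCRIPTION fields would look roughly like this (the exact layout is my assumption, not taken from the question):

Package: myPack11
Imports:
    utils,
    devtools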
Then I run check
devtools::check("myPack11")
And I still get this note
#checking R code for possible problems ... NOTE
#myFunctionMul: no visible binding for global variable 'eleven'
#myFunctionSum: no visible binding for global variable 'eleven'
#Undefined global functions or variables:eleven
I have also tried making an environment, combining Tomas Kalibera's suggestion with an example I found on the Internet.
myEnvir <- new.env()
myEnvir$eleven <- 11
etc
In this case, I get the same note, but with "myEnvir", instead of "eleven"
First version of the question
I am trying to use "globalVariables" from the package utils. I am building an interface in R and I am planning to submit it to CRAN. This is my first time, so sorry if the question is very basic. I have read the help and I have tried to find examples, but I still don't know how to use it.
I have made a little silly example to illustrate my question, which is:
Where exactly do I have to place this line?
if(getRversion() >= "2.15.1"){utils::globalVariables("eleven")}
My example has three functions. myFunction11 creates the global variable "eleven" and the other two functions manipulate it. In my real code, I cannot use arguments in the functions that are called by means of a button. Consider that this is just a silly example to learn how to use globalVariables (to avoid binding notes).
myFunction11 <- function() {
  assign("eleven", 11, envir = environment(myFunctionSum))
}

myFunctionSum <- function(X) {
  print(X + eleven)
}

myFunctionMul <- function(X) {
  print(X * eleven)
}
Thank you in advance

I thought that the file globals.R would be generated automatically when using globalVariables. The problem was that I needed to create the package skeleton first, then create the file globals.R, add it to the R directory of the package, and check the package.
So, I needed to place this in a separate file:
#' @import utils
utils::globalVariables(c("eleven"))
and save it.

The documentation clearly says:
## In the same source file (to remind you that you did it) add:
if(getRversion() >= "2.15.1") utils::globalVariables(c(".obj1", "obj2"))
so put it in the same source file as your functions. It can go in any of your R source files, but the comment above recommends you put it close to your code. Looking at a bunch of GitHub packages reveals another common pattern: a globals.R file containing just this call, but that is probably a bad idea. If you later remove the global from your package but neglect to update globals.R, you could mask a problem. Putting it right next to the functions that use it will hopefully remind you when you edit those functions.
Make sure you put it outside any function definitions in the file, or it won't get seen.
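As a minimal sketch (the file name is arbitrary), the layout of such a source file could be:

# R/sum-functions.R
# Silence the "no visible binding" NOTE for the variable used below.
if (getRversion() >= "2.15.1") utils::globalVariables("eleven")

myFunctionSum <- function(X) {
  print(X + eleven)
}

myFunctionMul <- function(X) {
  print(X * eleven)
}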

You cannot modify bindings in a package namespace once the package is loaded (and the namespace sealed, and bindings locked). The check tool helps you to spot violations of this restriction, so you find out about the problem when checking the package rather than while running it. globalVariables is just a call to silence the check when it looks for these violations, which is undesirable in almost all cases. If you really need mutable state in a package, you can create a new environment (using new.env) and bind it to an (unexported) "global" variable in your namespace. This binding will be locked, but this is ok, because in R you can change an environment in place (add/remove elements, effectively modifying the elements).
The best situation is however when you can keep all mutable state in user objects (passed in as arguments into functions, and their modified versions returned as output values of functions).
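A minimal sketch of that environment pattern (the function and variable names here are invented for illustration):

# An unexported "global" holding the package's mutable state. The binding
# .pkg_state itself is locked when the namespace is sealed, but the
# environment it points to can still be modified in place.
.pkg_state <- new.env(parent = emptyenv())

set_eleven <- function(value = 11) {
  .pkg_state$eleven <- value
  invisible(value)
}

get_eleven <- function() {
  .pkg_state$eleven
}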

Related

Is it unwise to modify the class of functions in other packages?

There's a bit of a preamble before I get to my question, so hang with me!
For an R package I'm working on I'd like to make it as easy as possible for users to partially apply functions inline. I had the idea of using the [] operator to call my partial application function, which I've named "partialApplication." What I aim to achieve is this:
dnorm[mean = 3](1:10)
# Which would be exactly equivalent to:
dnorm(1:10, mean = 3)
To achieve this I tried defining a new [] method for objects of class function, i.e.
`[.function` <- function(...) partialApplication(...)
However, R gives a warning that the [] method for function objects is "locked." (Is there any way to override this?)
My idea seemed to be thwarted, but I thought of one simple solution: I can invent a new S3 class "partialAppliable" and create a [] method for it, i.e.
`[.partialAppliable` = function(...) partialApplication(...)
Then, I can take any function I want and append 'partialAppliable' to its class, and now my method will work.
class(dnorm) = append(class(dnorm), 'partialAppliable')
dnorm[mean = 3](1:10)
# It works!
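(The question doesn't show partialApplication itself; a minimal sketch of what it could look like, purely as an assumption, working together with the `[.partialAppliable` method above:)

partialApplication <- function(f, ...) {
  # Capture the arguments to fix now; splice them in on every later call.
  fixed <- list(...)
  function(...) do.call(f, c(list(...), fixed))
}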
Now here's my question/problem: I'd like users to be able to use any function they want, so I thought, what if I loop through all the objects in the active environment (using ls) and append 'partialAppliable' to the class of all functions? For instance:
allobjs <- unlist(lapply(search(), ls))
# This lists all objects defined in all active packages
for (i in allobjs) {
  if (is.function(get(i))) {
    curfunc <- get(i)
    class(curfunc) <- append(class(curfunc), 'partialAppliable')
    assign(i, curfunc)
  }
}
Voilà! It works. (I know, I should probably assign the modified functions back into their original package environments, but you get the picture).
Now, I'm not a professional programmer, but I've picked up that doing this sort of thing (globally modifying all variables in all packages) is generally considered unwise/risky. However, I can't think of any specific problems that will arise. So here's my question: what problems might arise from doing this? Can anyone think of specific functions/packages that will be broken by doing this?
Thanks!
This is similar to what the Defaults package did. The package is archived because the author decided that modifying other packages' code is a "very bad thing". I think most people would agree. Just because you can do it does not mean it's a good idea.
And, no, you most certainly should not assign the modified functions back into their original package environments. CRAN does not like it when packages modify the user's search path unnecessarily, so I would be surprised if they allowed a package to modify other packages' function arguments.
You could work around that by putting all the modified functions in an environment on the search path. But then you have to ensure that environment is always searched first, which means modifying the search path every time another package is loaded.
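Just to make the fragility concrete, that workaround would look something like this (a sketch, not a recommendation; the environment name is invented):

patched <- new.env()
patched$dnorm <- stats::dnorm
class(patched$dnorm) <- c(class(patched$dnorm), "partialAppliable")
attach(patched, name = "patchedFuns")
# library() attaches new packages at position 2 of the search path, so every
# package loaded afterwards jumps ahead of "patchedFuns", and it would have to
# be re-attached (or the search path reordered) each time.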
Changing arguments for functions in other packages also has the potential to make it very difficult for others to reproduce your results because they must have all your argument settings. Unless you always call functions with all their arguments specified, which defeats the purpose of what you're trying to do.

How to check for accidentally redefining a function name in R

I'm writing some fairly involved R code spread across multiple files and collected together into a package. A problem I've run into on occasion is that I will define a utility function in one file that has the same name as another utility function defined in another file. One of the two definitions gets replaced, leading to unintended behavior. Is there any sort of tool to check for this kind of accidental redefinition? Something that would check that no two top-level assignments foo <- ... in the package assign to the same name?
As pointed out in the comments, the right way to do this is to use packages. Packages give functions their own namespaces automatically, plus they make it very easy to reuse and share code. If you're using RStudio, you can create one with very little effort from the New Project menu.
However, if you can't use packages or namespaces for some reason, there's still a way to do what you want: you can lock a variable (including a function) so that it's not possible to overwrite it.
> pin <- 11
> lockBinding("pin", .GlobalEnv)
> pin <- 12
Error: cannot change value of locked binding for 'pin'
See Binding and Environment Locking for details.
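None of this is in the answer above, but if you do want an automated check, a small sketch along these lines (assuming your sources live in an R/ directory and you only care about simple top-level name <- value assignments) can flag names defined more than once:

find_duplicate_defs <- function(dir = "R") {
  files <- list.files(dir, pattern = "\\.[Rr]$", full.names = TRUE)
  defs <- do.call(rbind, lapply(files, function(f) {
    exprs <- parse(f, keep.source = FALSE)
    # Keep the left-hand side of top-level `name <- ...` / `name = ...`.
    lhs <- vapply(exprs, function(e) {
      if (is.call(e) &&
          (identical(e[[1]], as.name("<-")) || identical(e[[1]], as.name("="))) &&
          is.name(e[[2]]))
        as.character(e[[2]]) else NA_character_
    }, character(1))
    kept <- lhs[!is.na(lhs)]
    data.frame(name = kept, file = rep(f, length(kept)), stringsAsFactors = FALSE)
  }))
  # Rows whose name is defined in more than one place.
  defs[defs$name %in% defs$name[duplicated(defs$name)], ]
}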

Changing the load order of files in an R package

I'm writing a package for R in which the exported functions are decorated by a higher-order function that adds error checking and some other boilerplate code.
However, because this code is at the top level, it is evaluated after parsing. This means that the load order of the package files is important.
To give an equivalent but simplified example, suppose I have a package with two files (Negate2 and Utils), and I require Negate2.R to be loaded first for the function isfalse() to be defined without throwing an error.
# /Negate2.R
Negate2 <- Negate
# -------------------
# /Utils.R
istrue <- isTRUE
isfalse <- Negate2(istrue)
Is it possible to structure NAMESPACE, DESCRIPTION (collate) or another package file in order to change the load order of files? The internal working of the R package structure and CRAN are still black magic to me.
It is possible to get around this problem using awkward hacks, but I'm looking for the least repetitive way of solving it. The wrapper function must be a higher-order function, since it also changes the function call semantics of its input function. The package is code-heavy (~6000 lines, 100 functions), so repetition would be... problematic.
Solution
As @Manetheran points out, to change the load order you just change the order of the file names in the DESCRIPTION file.
# /DESCRIPTION
Collate:
    'Negate2.R'
    'Utils.R'
The Collate: field of the DESCRIPTION file allows you to change the order files are loaded when the package is built.
I stumbled across the answer to this question yesterday while reading up on Roxygen. If you've been documenting your functions with Roxygen, it can try to intelligently order your R source files in the Collate: field (based on where S4 class and method definitions are). This can be done by adding "collate" to the roclets argument of roxygenize. Alternatively if you're developing in RStudio there is a simple box that can be checked under Build->Configure Build Tools->Configure... (Button next to "Generate documentation with Roxygen").
R loads files in alphabetical order. To change the order, the Collate field of the DESCRIPTION file can be used.
roxygen2 provides an explicit way of saying that one file must be loaded before another: @include. The @include tag gives a space-separated list of file names that should be loaded before the current file:
#' @include class-a.r
setClass("B", contains = "A")
If any @include tags are present in the package, roxygen2 will set the Collate field in the DESCRIPTION.
You need to regenerate the roxygen2 documentation for the changes to take effect.
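If you document with roxygen2, regenerating with the collate roclet enabled (shown here as one possible workflow, not part of the original answer) rewrites the Collate field from the @include tags:

roxygen2::roxygenise(roclets = c("collate", "namespace", "rd"))
# or, via devtools:
devtools::document(roclets = c("collate", "namespace", "rd"))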

Adding a function in R for use without loading it

So I have code for a custom function in R, e.g. simply:
f <- function(x) return(x)
What I want to do is be able to call this function f as if it came with default R. That is, I don't want to have to source() the file containing the code to be able to use the function. Is there a way to accomplish this?
What you are looking to do is documented extensively under ?Startup. Basically, you need to customize your Rprofile.site or .Rprofile to include the functions that you want available on startup. A simple guide can be found at the Quick-R site.
One thing to note: if you are commonly sharing code with others, you do need to remember what you've changed in your startup options and be sure that the relevant functions are also shared.
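A minimal sketch of an ~/.Rprofile entry (the environment name is arbitrary; attaching an environment keeps the functions out of the global environment, so rm(list = ls()) won't remove them):

.startup <- new.env()
.startup$f <- function(x) x
attach(.startup, name = "startup:functions")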

Dynamically Generate Reference Classes

I'm attempting to generate reference classes within an R package on the fly, and it's proving to be fairly difficult. Here are the approaches I've taken and problems I've run into:
I'm creating a package in which I hope to be able to dynamically read in a schema and automatically generate an associated reference class (think SOAP). Of course, this means I won't be able to define my reference classes before-hand in the package sources.
I initially attempted to create a new class using a simple:
myClass <- setRefClass("NewClassName", fields=list(fieldA="character"))
which, of course, works fine when executed interactively, but when included in the package sources, I get a locked binding error. From my reading, it looks like this occurs because when running interactively, the class information is being stored in the global environment, which is not locked, while my package's base environment is locked.
I then found a thread that suggested using something to the effect of:
myClass <- setRefClass("NewClassName", fields=list(fieldA="character"), where=globalenv())
This actually crashed RStudio when I tried to build the package, so I don't have a log of the error it generated, unfortunately, but it certainly didn't work.
Next I tried creating a new environment within my package which I could use to store these reference classes. So I added a .classEnv <- new.env() line in my package sources (not inside of any function) and then attempted to use this class when creating a new reference class:
myClass <- setRefClass("NewClassName", fields=list(fieldA="character"), where=.classEnv)
This actually seemed to work OK, but generates the following warning:
> myClass <- setRefClass("NewClassName", where=.classEnv)
Warning message:
In getPackageName(where) :
Created a package name, ‘2013-04-23 10:19:14’, when none found
So, for some reason, methods::getPackageName() isn't able to pick up which package my new environment is in?
Is there a way to create my new environment differently so that getPackageName() can properly recognize the package? Can I add some feature which allows me to help getPackageName() detect the package? Will this even work if I can deal with the warning, or am I misusing reference classes by trying to create them dynamically?
To get the conversation going, I found that getPackageName stores the package name in a hidden .packageName variable in the specified environment.
So you can actually get around the warning with
assign(".packageName", "MyPkg", envir=.classEnv)
myClass <- setRefClass("NewClassName", fields=classFields, where=.classEnv)
which resolves the warning, but the documentation says not to trust the .packageName variable indefinitely, and I still feel like I'm hacking this in and may be misunderstanding something important about reference classes and their relationship to environments.
Full details from documentation:
Package names are normally installed during loading of the package, by the INSTALL script or by the library function. (Currently, the name is stored as the object .packageName but don't trust this for the future.)
Edit:
After reading a little further, the setPackageName method may be a more reliable way to set the package name for the environment. Per the docs:
setPackageName can be used to establish a package name in an environment that would otherwise not have one. This allows you to create classes and/or methods in an arbitrary environment, but it is usually preferable to create packages by the standard R programming tools (package.skeleton, etc.)
So it looks like one valid solution would be the following:
setPackageName("MyPkg", .classEnv)
myClass <- setRefClass("NewClassName", fields=classFields, where=.classEnv)
That eliminates the warning message and doesn't rely on anything that's documented as unstable. I'm still not clear why it's necessary, but...
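For completeness, a hypothetical use of the generator created this way (assuming classFields was list(fieldA = "character")):

obj <- myClass$new(fieldA = "dynamically created")
obj$fieldA
# [1] "dynamically created"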
