Define Global Variables when creating packages - r

I am creating a new R package named "mypackagefunction". Part of its code looks like this:
mypackagefunction <- function() {
  ## This is the constructor of my package
  ## 1st step: define variables
  gdata <<- NULL
  # ...
  # below this, there are more functions and code
}
I build and reload the package in RStudio and then run the check, at which point I receive this warning:
mypackagefunction: no visible binding for '<<-' assignment to ‘gdata’
But when I run my package with:
mypackagefunction()
I can access the variable defined inside the package, with this result:
> mypackagefunction()
> gdata
NULL
How can I remove this NOTE or warning when I check my package? Or is there another way to define global variables?

There are standard ways to include data in a package - if you want some particular R object to be available to the user of the package, this is what you should do. Data is not limited to data frames and matrices - any R object(s) can be included.
If, on the other hand, your intention was to modify the global environment every time a function is called, then you're doing it wrong. In R's functional programming paradigm, functions return objects that can be assigned into the global environment by the user. Objects don't just "appear" in the global environment, with the programmer hoping that the user both (a) knows to look for them and (b) didn't have any objects of the same name that they wanted to keep (because they just got overwritten). It is possible to write code like this (using <<- as in your question, or explicitly calling assign as in @abhiieor's answer), but it will probably not be accepted to CRAN as it violates CRAN policy.
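As a rough illustration of that advice (a sketch only: the names .pkg_state, init_state, set_gdata and get_gdata are hypothetical and do not come from the question), the state can live in an environment owned by the package rather than in the global environment, with functions returning values for the caller to use:

```r
# Private environment that holds the package's state (hypothetical name).
# In a package this would sit at the top level of a file in R/.
.pkg_state <- new.env(parent = emptyenv())

# "Constructor": creates the binding inside the private environment.
# assign() is used because `env$name <- NULL` would not create a binding.
init_state <- function() {
  assign("gdata", NULL, envir = .pkg_state)
  invisible(NULL)
}

# Accessors, so that callers never touch the global environment.
set_gdata <- function(value) {
  assign("gdata", value, envir = .pkg_state)
  invisible(value)
}
get_gdata <- function() .pkg_state$gdata

init_state()
set_gdata(c(1, 2, 3))
current <- get_gdata()  # the caller decides where to store the result
```

Because nothing is assigned with <<- to an undefined name, this pattern should not trigger the "no visible binding" note, though that is an expectation rather than something checked here.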

Another way to define a global variable is assign('prev_id', id, envir = .GlobalEnv), where id is the variable or value being assigned and prev_id is the name of the global variable.

Related

How to include a closure in an R-package?

I would like to include a closure with the functions of an R package we are writing. The function (and its siblings) will have data in its environment, perform a comparison of input with the data, and return the result. To illustrate, think of a function with an inbuilt telephone directory: you query with a number and the function returns a name.
This function will be called as a helper by several other functions in our R package, so it has to exist once the package is loaded. And we want the function to be available in the package environment, just like any other function.
Should I create it via its factory function in .onLoad() and assign() it to the package environment? Could I ship it as an .RDS? Or RData, or does this violate CRAN policy on "binary executable code"? Or is there a different, canonical way? And where would the code and the data (or the RDS/RData) go in the package directory structure?
(I see that the question of how to document a closure has been discussed here).
For the benefit of anyone stumbling on this question: the solution I finally worked out involves a few steps but is "clean" as far as I can tell.
1) Put the factory function in a file R/aaa.R to ensure it gets loaded before the closure.
2) Put the data that the closure uses into the standard inst/extdata/ folder.
3) Put a file with the closure's name and proper docstring into R/: define the closure as a normal function that just returns nothing. This is necessary so the function is properly exported and known in the package namespace. Immediately call the factory function to create the closure and overwrite the original definition. Note: it's not enough to just bring the data into the factory function as an argument, it actually needs to be accessed before defining the closure. Why? That's because lazy loading won't actually have loaded the data into the environment you need it in unless you access it.
That's all. Summary: create a stub for your closure, then overwrite that with the return value of the factory function.
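A minimal sketch of the stub-then-overwrite pattern, assuming hypothetical names (make_lookup, lookup_name) and an inline named vector standing in for the data that would really be read from inst/extdata/:

```r
# R/aaa.R: the factory (hypothetical name). force() makes sure the data is
# actually evaluated before the closure is created, as the note above advises.
make_lookup <- function(directory) {
  force(directory)
  function(number) {
    # look the number up by name; unknown numbers give NA
    unname(directory[as.character(number)])
  }
}

# R/lookup_name.R: documented stub, so the name exists and can be exported.
#' Look up a name by telephone number
#' @export
lookup_name <- function(number) NULL

# Immediately overwrite the stub with the closure returned by the factory.
# The inline vector is a stand-in for data loaded from inst/extdata/.
lookup_name <- make_lookup(c("5551234" = "Alice", "5559876" = "Bob"))

lookup_name("5551234")
```

The closure keeps its own copy of the directory in its enclosing environment, so helpers elsewhere in the package can call lookup_name() like any other function.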
If the factory function is called later by the package user, but we still want the returned closure to live inside the package (for example, so that it cannot be changed by anything other than the factory, is reliably accessible from within the package, is documented, etc.):
# exported function (visible to user)
# everything this function does is 'outsourced'
# to a non-exported function that we can overwrite with the factory:
visible_function <- function(...) {
  hidden_function(...)
}
# non-exported function (invisible to the user)
# called by the visible function
# fails unless the factory is called first
hidden_function <- function(...) {
  stop("call factory_function() before you can use visible_function()")
}
# exported function, visible to the user.
# replaces the hidden function called by the visible function
factory_function <- function(x) {
  produced_function <- function(...) {
    print(paste(x, "is an object forever stored in my namespace!"))
  }
  utils::assignInNamespace("hidden_function",
                           produced_function,
                           ns = "myPackageName")
}
Note that R CMD check throws a NOTE on assignInNamespace, so CRAN won't easily accept this solution.

R package can see variables not passed to it

I am writing a new R package and find that variables that I have not explicitly passed to a function in the package (as input argument) are visible within it, e.g.:
myFunc <- function(a,b,c) {
print(d)
}
Here d is defined in the calling .R script and has not been passed to myFunc, yet it is visible inside the function.
Any help would be great, thanks; I'm using R 3.2.4 and have been using roxygen2 (via devtools::document()) to create the NAMESPACE if that helps.
Isn't this just a consequence of the scoping rules in R?
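Calling your function creates a new environment for myFunc. When you reference d in print(d), the interpreter first checks that environment for an object called d. Because no such object exists, the interpreter next checks the function's enclosing environment, that is, the environment in which the function was defined (for a package function, the package namespace), and keeps moving up through the parent environments until it reaches the global environment. There it finds the variable defined in your .R script (assuming the script runs at top level) and prints it.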
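A small sketch of this lookup (the function caller is hypothetical and only there to show that a variable local to the caller is not what gets found):

```r
d <- "defined at top level"

# d is a free variable here: it is neither an argument nor a local variable.
myFunc <- function(a, b, c) {
  d
}

# This function has its own local d, but myFunc does not see it, because
# lookup follows the environment where myFunc was defined.
caller <- function() {
  d <- "local to caller"
  myFunc(1, 2, 3)
}

result <- caller()
print(result)
```

The result is the top-level d, which is why a variable in your script can leak into a function that never received it as an argument.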
Here's a link with more info and a pile of examples.
Very useful link, thanks. It looks like forcing limited scoping within a function (i.e. getting a function to not access the global scope) is not a default property of R.
I found a similar question here: R force local scope
Using the checkStrict function posted by the main responder to that question seems to have worked; it found an unintended use of a global variable.
> require(myCustomPackage)
> checkStrict(showDendro)
Warning message:
In checkStrict(showDendro) : global variables used: palName
where showDendro is a function inside my custom package.
So it seems the solution to my problem is:
1) While you can stop R from moving up to the global environment by enclosing all your functions in local(), that seems like a tedious solution.
2) When moving code from the global environment into its own function, run something like checkStrict to catch unintended use of global variables.
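For readers without the linked answer at hand, a rough sketch of the same idea can be built on codetools::findGlobals. This is not the checkStrict function from that answer; find_free_variables and the simplified show_dendro below are hypothetical stand-ins:

```r
# List the variables a function uses without defining them locally.
# merge = FALSE makes findGlobals() return functions and variables separately.
find_free_variables <- function(f) {
  codetools::findGlobals(f, merge = FALSE)$variables
}

# Simplified stand-in for a package function that forgot to take palName
# as an argument. It is only analysed here, never called.
show_dendro <- function(x) {
  plot(x, col = palName)
}

free_vars <- find_free_variables(show_dendro)
print(free_vars)
```

Any name reported here is resolved outside the function at run time, so it is a candidate for becoming an explicit argument.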

How to create a constant inside R package?

How can I create a constant inside an R package, one whose value cannot be changed? In other words, how can we lock a name-value pair in the package environment?
Example: in my package I use a quantile of the normal distribution in loops inside several functions, and I do not want to calculate (or create) it every time.
I tried k_q3 <- qnorm(1 - 0.01/2); lockBinding("k_q3", environment()), but it does not work.
UPDATE: The method above actually works. k_q3 cannot be changed either inside the package or outside it.
The simplest and cleanest way would be to create a function, e.g.
K_Q3 <- function() { qnorm(1 - 0.01/2) }
Note that calling functions in R has a non-negligible overhead. Avoid calling it inside loops, or copy its value to a local variable before the loop.
Create the constant just as you create function objects for your package: place a .R file that defines it in the "R" folder within your package directory. You can simply assign the numeric value 2.575829 to a variable name; just don't export it.
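The lockBinding approach from the question can be tried outside a package with a small sketch. Here pkg_env is a hypothetical stand-in for the package namespace, which is where environment() points when the code runs at the top level of a file in R/:

```r
# Stand-in for the package environment.
pkg_env <- new.env()

# Compute the constant once and lock its binding.
assign("k_q3", qnorm(1 - 0.01 / 2), envir = pkg_env)
lockBinding("k_q3", pkg_env)

# Any later attempt to reassign the locked name raises an error.
attempt <- tryCatch(
  {
    assign("k_q3", 0, envir = pkg_env)
    "changed"
  },
  error = function(e) "locked"
)
print(attempt)
```

Reading the value remains as cheap as reading any other variable, so nothing needs to be recomputed inside loops.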

Meaning of objects being masked by the global environment

When I load my package into the global environment, I get the following message
> library(saber)
Attaching package: ‘saber’
The following objects are masked _by_ ‘.GlobalEnv’:
load.schedule, teamStats
I don't know what that means, nor whether I should be concerned about it.
Why is this message being delivered, and what does it mean?
It means that you have objects (functions, usually) present in your global environment with the same name as (exported) things in your package. Type search() to see the order in which R resolves names.
The solution is to do one of the following:
1) don't create objects with those names in your global environment;
2) rename the objects in your package to something that's less likely to create a conflict, or rethink your decision to export them; or
3) remember that you will always have to refer to those objects as saber::teamStats.
Probably (2) is best, unless the circumstances that led to the message are truly unusual.
There's a third implied question that I don't think has been fully answered for this particular case: how do you fix it when an earlier version of your own function is stuck in the global environment and masks the new versions you're trying to test?
Renaming your function with every revision is not practical in this situation. I had the same problem and found that deleting the .RData file in the working directory before restarting R solved it.
This has happened to me only twice over hundreds of times assembling my packages. I'm still not sure how the functions occasionally get stuck in the global environment.
This means that you have objects named load.schedule and teamStats in your workspace as well as in the package you are attaching. It is warning you that when you call load.schedule, R will use the one in your workspace (since it is first in the search path) rather than the one from the package. Try for example
ddply <- function(x) x + 1
library(plyr)
# Attaching package: ‘plyr’
#
# The following object is masked _by_ ‘.GlobalEnv’:
#
# ddply
ddply(3) # the one we just defined is used, as global env is first in the search path
#[1] 4
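The same behaviour, and two ways around it, can be sketched with a package that ships with R. Here stats::sd is only a stand-in for a function such as saber::teamStats:

```r
# A workspace object that shares its name with a function exported by stats.
sd <- function(x) "my own sd"

masked   <- sd(c(1, 2, 3))         # the workspace definition is found first
explicit <- stats::sd(c(1, 2, 3))  # the :: operator bypasses the search path

# Removing the workspace object makes the package version visible again.
rm(sd)
restored <- sd(c(1, 2, 3))
```

Using rm() on the stale object (or starting from a clean workspace) is usually enough when the masking object is just a leftover from an earlier session.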
The answers above give the low-level causes.
I just thought it would be useful to point out that I got this message when I had a project open in RStudio and loaded the "same" library as that project.
With the explanations above, it's obvious that doing this creates some kind of conflict.
The reason is that you created objects with those two names in your R console session. Because they now live in the global environment, either remove them from your workspace if you don't need them, or rename them.
In my case:
I declared a time series as gas. Later, when loading the forecast package, I encountered the same message, so I renamed my variable and then used the package.
You can also use the conflicted package.
https://cran.r-project.org/web/packages/conflicted/index.html
If you want saber to be used by default set:
conflicted::conflict_prefer("kbl", winner = "saber")
If you want your objects to be used, set:
conflicted::conflict_prefer("kbl", winner = ".GlobalEnv")

Load data object when package is loaded

Is there a way to automatically load a data object from a package into memory when the package is loaded (but not yet attached)? I.e. the opposite of lazy loading? The object is used in one of the package functions, so it needs to be available at all times.
When the package is set to lazydata=false, the data object is not exported by the package at all, and needs to be loaded manually with data(). We could use something like:
.onLoad <- function(lib, pkg) {
  data(mydata, package = pkg)
}
However, data() loads the object in the global environment. I strongly prefer to load it in the package environment (which is what lazydata does) to prevent masking conflicts.
A workaround is to bypass the data mechanics completely, and simply hardcode the object in the package. So the package file myscore.R would look like
mymodel <- readRDS("inst/mymodel.rds")
myscore <- function(newdata){
predict(mymodel, newdata)
}
But this will lead to a huge packagedb for large data objects, and I am not sure what the consequences of that are.
As you say,
"The object is used in one of the package functions, so it needs to be available at all times."
I think the author of that package should really NOT use data(.) for that. Instead, they should define the object inside the package's R/ directory, either with simple R code in an R/*.R file or by using the sysdata.rda approach explained in "Writing R Extensions", the famous first reference for all these questions. In both cases the package author can also export the object, which is often desirable for other users, as in your case.
Of course this needs a polite conversation between you and the package author, and will only apply to the next version of that package.
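A sketch of how an R/sysdata.rda file could be produced, under stated assumptions: pkg_dir is a throwaway directory standing in for the package source tree, and mymodel is a toy lm fit on the built-in cars data rather than the real model:

```r
# Hypothetical package source directory (temporary, for illustration only).
pkg_dir <- file.path(tempdir(), "mypkg")
dir.create(file.path(pkg_dir, "R"), recursive = TRUE, showWarnings = FALSE)

# Stand-in for the real object that the package functions need.
mymodel <- lm(dist ~ speed, data = cars)

# Objects saved to R/sysdata.rda are placed in the package namespace when
# the package is built, so package functions can use them directly.
sysdata_file <- file.path(pkg_dir, "R", "sysdata.rda")
save(mymodel, file = sysdata_file)

# Check what the file contains by loading it into a scratch environment.
scratch <- new.env()
loaded_names <- load(sysdata_file, envir = scratch)
print(loaded_names)
```

With the file in place, a function such as myscore() can refer to mymodel by name without any call to data() or readRDS().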
I'm going to post this since it seems to work for my use case.
.onLoad() is:
.onLoad <- function(lib, pkg) {
  data(mydata, package = pkg,
       envir = parent.env(environment()))
}
You also need Imports: utils in DESCRIPTION and importFrom(utils, data) in NAMESPACE in order to pass R CMD check.
In my case I don't need the data object to be visible to the user, I need it to be visible to one of the functions in the package. If you need it visible to the user, that's going to be even harder (I think) because as far as I can tell you can't export data, just functions. The only way I've thought of to export data is to export a wrapper function for the data.
