How to include a closure in an R package?

I would like to include a closure with the functions of an R package we are writing. The function (and its siblings) will have data in its environment, perform a comparison of input with the data, and return the result. To illustrate, think of a function with an inbuilt telephone directory: you query with a number and the function returns a name.
This function will be called as a helper by several other functions in our R package, so it has to exist once the package is loaded. And we want the function to be available in the package environment, just like any other function.
Should I create it via its factory function in .onLoad() and assign() it to the package environment? Could I ship it as an .RDS? Or RData, or does this violate CRAN policy on "binary executable code"? Or is there a different, canonical way? And where would the code and the data (or the RDS/RData) go in the package directory structure?
(I see that the question of how to document a closure has been discussed here).

For the benefit of anyone stumbling on this question: the solution I finally worked out involves a few steps but is "clean" as far as I can tell.
Put the factory function in a file R/aaa.R to ensure it gets loaded before the closure.
Put the data that the closure uses into the standard inst/extdata/ folder.
Put a file with the closure's name and proper docstring into R/: define the closure as a normal function that just returns nothing. This is necessary so that the function is properly exported and known in the package namespace. Then immediately call the factory function to create the closure and overwrite the original definition. Note: it's not enough to pass the data into the factory function as an argument; it needs to be accessed inside the factory before the closure is defined. Why? Because R evaluates arguments lazily, the data won't actually have been loaded into the environment you need it in unless you access it.
That's all. Summary: create a stub for your closure, then overwrite that with the return value of the factory function.
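The steps above can be sketched roughly as follows. This is a minimal illustration that runs outside a package, not the author's actual code: the names make_lookup and lookup_name are hypothetical, and a small in-memory data frame stands in for whatever file would live under inst/extdata/.

```r
# R/aaa.R: factory function (hypothetical name), defined before the closure.
make_lookup <- function(directory) {
  # Evaluate the argument now, so the data really sits in the
  # closure's environment rather than in an unevaluated promise.
  force(directory)
  function(number) {
    directory$name[match(number, directory$number)]
  }
}

# R/lookup_name.R: documented stub, so the name is exported and known.
#' Look up a name by telephone number
#' @param number Character vector of telephone numbers.
#' @return Character vector of names (NA where there is no match).
#' @export
lookup_name <- function(number) NULL

# Immediately overwrite the stub with the closure from the factory.
# Illustrative in-memory data stands in for a file under inst/extdata/.
lookup_name <- make_lookup(
  data.frame(number = c("555-0100", "555-0199"),
             name = c("Ada", "Grace"),
             stringsAsFactors = FALSE)
)

lookup_name("555-0199")
```

The call to force() is what guarantees that the data has been accessed before the closure is returned.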

If the factory function is called later by the package user, but we still want the returned closure to live inside the package (for example, if we want it to be changed only by the factory, reliably accessible from within the package, documented, etc.):
# exported function (visible to user)
# everything this function does is 'outsourced'
# to a non-exported function that we can overwrite with the factory:
visible_function <- function(...) {
  hidden_function(...)
}
# non-exported function (invisible to the user)
# called by the visible function
# fails unless the factory is called first
hidden_function <- function(...) {
  stop("call factory_function() before you can use visible_function()")
}
# exported function, visible to the user.
# changes the hidden function called by the visible function
factory_function <- function(x) {
  force(x)  # evaluate x now so the closure captures its value
  produced_function <- function() {
    print(paste(x, "is an object forever stored in my namespace!"))
  }
  assignInNamespace("hidden_function",
                    produced_function,
                    ns = "myPackageName")
}
Note that R CMD check raises a NOTE for assignInNamespace, so CRAN won't easily accept this solution.
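One possible way to get similar behaviour without assignInNamespace is to keep the produced closure in an internal environment instead of overwriting a namespace binding. The sketch below is an assumption on my part rather than part of the answer above; the name .pkg_state is hypothetical, and the other names are reused from the answer for continuity.

```r
# Internal, non-exported environment that holds the current closure.
.pkg_state <- new.env(parent = emptyenv())

# Exported: delegates to whatever closure the factory has stored.
visible_function <- function(...) {
  if (is.null(.pkg_state$hidden_function)) {
    stop("call factory_function() before you can use visible_function()")
  }
  .pkg_state$hidden_function(...)
}

# Exported: builds the closure and stores it in the internal environment.
factory_function <- function(x) {
  force(x)  # capture the value of x at factory time
  .pkg_state$hidden_function <- function() {
    paste(x, "is an object stored in the package's internal environment")
  }
  invisible(NULL)
}

factory_function("This")
visible_function()
```

The bindings in a package namespace are locked after loading, but the contents of an environment stored there can still be modified, which is why this variant should not need assignInNamespace.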

Related

Self-written R package does not find its own function

I created a package with some functions that are helpful at my company. Recently, I restructured the package so that there are helper functions which need not be accessible to everyone but are called internally from other (exported) functions of the package. These helper functions are not exported to the namespace (no #' @export in the respective .R files).
Now, when I call one of the "major" (exported) functions, I get the error message (no real function names):
Error in major_function() : could not find function "helper_function"
I'm fairly new to building packages, but from what I understood so far (from https://cran.r-project.org/web/packages/roxygen2/vignettes/namespace.html), it should be necessary neither to export the helper functions nor to add #' @importFrom my_package helper_function to the .R file of the major function.
When I tried this, it actually produced errors when checking the package. I also tried to call the helper functions with my_package:::helper_function, but this led to the note that it should almost never be necessary to call functions from the same package like this.
Maybe useful information:
The error occurs only when I call a major_function_1 which internally calls major_function_2 which calls a helper_function.
I think there is more to your problem than what you state. As long as all your functions are defined in the same namespace (this also means that all your functions need to live in .R files in the same folder), the calling function should find the helper-functions accordingly.
I suspect you have your helper functions nested in some way, and that is causing the problem.
I recommend rechecking your namespace structure, or posting a simplified outline of your package here.
Another possible reason is that you do not export 'major_function_2' in the NAMESPACE file in your package root (maybe you have not regenerated this file with roxygen2), and additionally have a local shadow of the calling function 'major_function_1'. Check this and rerun from a clean build.

With get() call a function from an unloaded package without using library

I want to call a function from an unloaded package by having the function name stored in a list.
Normally I would just use:
library(shiny)
pagelist <- list("type" = "p") # object with the function name (will be loaded from .txt file)
get(pagelist$type[1])("Display this text")
but since when writing a package you're not allowed to load the library I'd have to use something like
get(shiny::pagelist$type[1])("Display this text")
which doesn't work. Is there a way to call the function from the function name stored in the list, without having to load the library? Note that it should be possible to call many different functions like this (all from the same package), so just using e.g.
if (pagelist$type[1] == "p"){
shiny::p("Display this text")
}
would require quite a long list of if else statements.
Use getExportedValue:
getExportedValue("shiny",pagelist$type[1])("Display this text")
#<p>Display this text</p>
You shouldn't use getExportedValue as was done in the accepted answer, because its help page describes the functions there as "Internal functions to support reflection on namespace objects." It's bad practice to use internal functions, because they can change in subtle ways with very little notice.
The right way to do the equivalent of shiny::p when both "shiny" and "p" are character strings in variables is to use get:
get("p", envir = loadNamespace("shiny"))
The loadNamespace function returns the package's namespace environment (which also contains its non-exported objects); it's fairly quick to execute if the package is already loaded.
The original question asked
Is there a way to call the function from the function name stored in
the list, without having to load the library?
(where I think "library" should be "package" in R jargon). The answer to this is "no", you can't get any object from a package unless you load the package. However, loading is simpler than attaching, so this won't put shiny on the search list (making all of shiny visible to the user), it's just loaded internally in R.
A related question is why get("shiny::p") doesn't work. The answer is that shiny::p is an expression to evaluate, and get only works on names.
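The approaches discussed in these answers can be compared in a short sketch. To keep it runnable everywhere, it uses the stats package and its median function as stand-ins for shiny and p; that substitution is an assumption made purely for illustration.

```r
# Package and function names held as strings; "stats" stands in for "shiny"
# (assumption: chosen only because stats ships with every R installation).
pkg <- "stats"
fn_name <- "median"

# Look the name up in the package's namespace without attaching the package.
f <- get(fn_name, envir = loadNamespace(pkg), mode = "function")
f(c(1, 5, 9))

# The getExportedValue() route, which only sees exported objects.
g <- getExportedValue(pkg, fn_name)
identical(f, g)

# get() takes a name, not an expression, so "stats::median" is not found.
res <- tryCatch(get("stats::median"), error = function(e) "not found")
res
```

In both working variants the package is loaded but not attached, so nothing is added to the search path.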

R: How do I add an extra function to a package?

I would like to add an idiosyncratically modified function to a package written by someone else, with an R script, i.e. just for the session, not permanently. The specific example is, let's say, bls_map_county2() added to the blscrapeR package. bls_map_county2 is just a copy of the bls_map_county() function with an added ... argument, for purposes of changing a few of the map drawing parameters. I have not yet inserted the additional parameters. Running the function as-is, I get the error:
Error in BLS_map_county(map_data = df, fill_rate = "unemployed_rate", :
could not find function "geom_map"
I assume this is because my function does not point to the blscrapeR namespace. How do I assign my function to the (installed, loaded) blscrapeR namespace, and is there anything else I need to do to let it access whatever machinery from the package it requires?
When I am hacking on a function in a particular package that in turn calls other functions I often use this form after the definition:
mod_func <- function( args) {body hacked}
environment(mod_func) <- environment(old_func)
But I think the function you might really want is assignInNamespace. These methods will allow the access to non-exported functions in loaded packages. They will not however succeed if the package is not loaded. So you may want to have a stopifnot() check surrounding require(pkgname).
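The environment() reassignment described above can be tried out without any particular package. In this sketch a plain environment named fake_ns plays the role of a package namespace, and helper and old_func are hypothetical stand-ins for a non-exported helper and the function being copied.

```r
# Stand-in for a package namespace (hypothetical; a real one would come
# from an installed package).
fake_ns <- new.env()
local({
  helper <- function(x) x * 2            # plays the non-exported helper
  old_func <- function(x) helper(x) + 1  # plays the original package function
}, envir = fake_ns)

# Modified copy defined in the calling environment: it cannot see helper().
mod_func <- function(x, ...) helper(x) + 100
before <- tryCatch(mod_func(1), error = function(e) conditionMessage(e))

# Re-home the copy so it resolves names where old_func does.
environment(mod_func) <- environment(fake_ns$old_func)
after <- mod_func(1)
```

Before the reassignment the call fails because helper cannot be found; afterwards the copy resolves helper in the same environment as the original function.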
There are two parts to this answer - first a generic answer to your question, and 2nd a specific answer for the particular function that you reference, in which the problem is something slightly different.
1) generic solution to accessing internal functions when you edit a package function
You should already have access to the package namespace, since you loaded it, so it is only the unexported functions that will give you issues.
I usually just prepend the package name with the ::: operator to the non exported functions. I.e., find every instance of a call to some_internal_function(), and replace it with PackageName:::some_internal_function(). If there are several different internal functions called within the function you are editing, you may need to do this a few times for each of the offending function calls.
The help page for ::: does contain these warnings
Beware -- use ':::' at your own risk!
and
It is typically a design mistake to use ::: in your code since the
corresponding object has probably been kept internal for a good
reason. Consider contacting the package maintainer if you feel the
need to access the object for anything but mere inspection.
But for what you are doing, in terms of temporarily hacking another function from the same package for your own use, these warnings should be safe to ignore (at your own risk, of course, as it says in the manual).
2) In the case of blscrapeR::bls_map_county()
The offending line in this case is
ggplot2::ggplot() + geom_map(...
in which the package writers have specified the ggplot2 namespace for ggplot(), but forgotten to do so for geom_map(), which is also part of ggplot2 (and not an internal function in blscrapeR).
In this case, just load ggplot2, and you should be good to go.
You may also consider contacting the package maintainer to inform them of this error.

Define Global Variables when creating packages

I have this problem. I am creating a new package with name "mypackagefunction" for R whose partial code is this
mypackagefunction<-function(){
##This is the constructor of my package
##1st step: define variables
gdata <<- NULL
#...
#below of this, there are more functions and code
}
So, I build and reload in R Studio and then check and in this step I receive this warning:
mypackagefunction: no visible binding for '<<-' assignment to ‘gdata’
But when I run my package with:
mypackagefunction()
I can access the variable defined inside the package function, with this result:
> mypackagefunction()
> gdata
NULL
How can I remove this NOTE or warning when I check my package? Or is there another way to define global variables?
There are standard ways to include data in a package - if you want some particular R object to be available to the user of the package, this is what you should do. Data is not limited to data frames and matrices - any R object(s) can be included.
If, on the other hand, your intention was to modify the global environment every time a function is called, then you're doing it wrong. In R's functional programming paradigm, functions return objects that can be assigned into the global environment by the user. Objects don't just "appear" in the global environment, with the programmer hoping that the user both (a) knows to look for them and (b) didn't have any objects of the same name that they wanted to keep (because they just got overwritten). It is possible to write code like this (using <<- as in your question, or explicitly calling assign as in @abhiieor's answer), but it will probably not be accepted to CRAN as it violates CRAN policy.
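As a rough illustration of the "functions return objects" style described in this answer, the sketch below returns the state to the caller instead of writing it into the global environment. The names new_state and set_gdata are hypothetical; only gdata comes from the question.

```r
# Constructor returns the state instead of creating a global variable;
# the caller decides where (and under what name) to keep it.
new_state <- function() {
  list(gdata = NULL)
}

# Functions take the state as input and return an updated copy.
set_gdata <- function(state, value) {
  state$gdata <- value
  state
}

state <- new_state()
state <- set_gdata(state, 1:3)
state$gdata
```

Nothing is created behind the user's back here, so an existing object called gdata in the user's workspace is never overwritten.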
Another way of defining a global variable is assign('prev_id', id, envir = .GlobalEnv), where id is the value (or variable) being assigned and prev_id is the name of the global variable.

R package can see variables not passed to it

I am writing a new R package and find that variables that I have not explicitly passed to a function in the package (as input argument) are visible within it, e.g.:
myFunc <- function(a,b,c) {
print(d)
}
where d is in the caller .R script, but has not been passed to myFunc, is visible.
Any help would be great, thanks; I'm using R 3.2.4 and have been using roxygen2 (via devtools::document()) to create the NAMESPACE if that helps.
Isn't this just a consequence of the scoping rules in R?
Calling myFunc creates a new environment for that call. When you reference d in print(d), the interpreter first checks that environment for an object called d. Because no such object exists, it continues up the chain of enclosing environments: for a package function that is the package namespace, then its imports and the base namespace, then the global environment and the rest of the search path. It finds the variable that your .R script defined in the global environment and prints it.
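A small sketch of this lookup, with hypothetical names my_func and caller. It also shows that the search continues from the environment in which the function was defined, not from the function that happens to call it.

```r
d <- "defined in the global environment"

# my_func has no local d, so the lookup continues in the environment
# where my_func was defined (here, the top-level environment).
my_func <- function(a, b, c) d

# The caller's local d is not what my_func sees: the lookup follows
# the defining environment, not the calling one.
caller <- function() {
  d <- "defined in the caller"
  my_func(1, 2, 3)
}

caller()
```

For a function inside a package the chain is longer, but it still ends up at the global environment, which is why a d created by a script is visible.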
Here's a link with more info and a pile of examples.
Very useful link, thanks. It looks like forcing limited scoping within a function (i.e. getting a function to not access the global scope) is not a default property of R.
I found a similar question here: R force local scope
Using the checkStrict function posted by the main responder to that question seems to have worked; it found an unintended use of a global variable.
> require(myCustomPackage)
> checkStrict(showDendro)
Warning message:
In checkStrict(showDendro) : global variables used: palName
where showDendro is a function inside my custom package.
So it seems the solution to my problem is:
1) While you can stop R from moving up to the global environment by enclosing all your functions in the local() function, that seems like a tedious solution.
2) When moving code from the global environment into its own function, run something like checkStrict to catch unintended use of global variables.
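A check of this kind can be sketched with codetools::findGlobals. This is not the checkStrict function mentioned above, just an illustration of the same idea; it assumes the recommended codetools package is installed, and show_dendro is a hypothetical stand-in for the package function.

```r
# Assumes the recommended package codetools is installed (it ships with
# standard R distributions).
show_dendro <- function(x) {
  plot(x, col = palName)  # palName is never defined inside the function
}

# Names the function uses but does not define, split by kind.
globals <- codetools::findGlobals(show_dendro, merge = FALSE)
globals$variables
```

Any variable name listed there that is not meant to come from a package or from the user's workspace is a candidate for an unintended global.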
