Hiding Undocumented Functions in a Package - Use of .function_name?

I've got some functions I need to make available in a package, but I don't want to export them or write documentation for them. I'd just hide them inside another function, but they need to be available to several functions, so doing that becomes a scoping and maintenance issue. What is the right way to do this? By that I mean: do they need special names, do they go somewhere other than the R subdirectory, can I put them in a single file, etc.? I've checked the manuals, and what I'm after is something like the .Internal mechanism in core R, but I don't see any instructions about how to do this generally. I thought I had seen something about this before but cannot locate it just now. Thx.

My solution is to remove the function's export from NAMESPACE and call the internal function as NAME-OF-PACKAGE:::NAME-OF-INTERNAL-FUNCTION. For example, if your package name is RP and the internal function is named IFC, the call would be RP:::IFC(). Notice that with :: (two colons) you can only call functions that are exported in NAMESPACE, while with ::: (three colons) you can call all functions, internal as well as exported.
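As a concrete sketch using the hypothetical names from this answer (package RP, internal function IFC), the NAMESPACE simply omits the internal function, and callers reach it with three colons:

```r
# NAMESPACE of the hypothetical package RP
export(public_fn)   # reachable as RP::public_fn()
# IFC is deliberately not listed, so RP::IFC fails with an error...

# ...but three colons still reach the unexported function from anywhere:
# RP:::IFC()
```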

After asking on R-help, here is the answer. @Dwin is correct: do not export the internal functions (so fix up your export directives in NAMESPACE - don't use exportPattern, but rather name the exported functions explicitly using export). You can call the internal functions whatever you want; there is no special naming convention. And you do not have to write Rd files for functions you don't export.
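In NAMESPACE terms, that means replacing a catch-all exportPattern directive with explicit export calls (the function names below are invented for illustration):

```r
# Before: exports every object whose name doesn't start with a dot
# exportPattern("^[^.]")

# After: only the named functions are exported; everything else stays internal
export(fit_model)
export(summarize_model)
```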

Related

Can I automatically add functions called using pkg::fct to the importFrom section in roxygen2?

When I write a package, I usually call external functions using pkg::fct() just to make it very clear and explicit where the function is coming from. I'm aware of the small overhead, but can usually ignore it.
On the other hand I like it when all external functions appear in the roxygen tags to give an overview of what is used in the function.
Is there a way to automatically add all functions called via pkg::fct() to @importFrom? And is that a good idea after all?
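For reference, the manual version of what the question wants to automate looks like this (the function is invented; stats::median is just a convenient real example):

```r
#' Summarize a numeric vector
#'
#' @importFrom stats median
#' @export
my_summary <- function(x) {
  # explicit pkg::fct() call style described in the question;
  # the @importFrom tag above is what one would like generated automatically
  stats::median(x)
}
```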

Get all functions available at runtime

Is there a way to get all functions available at runtime? Or is there a hidden database keeping track of all the loaded functions, variables, and modules accessible by our code?
I'll post my comment as an answer since it seemed useful:
Do you mean user-defined functions / variables and loaded modules? You can get those with whos(). To get functions exported from a module you can do whos(MyModule)

Best practices for developing a suite of dependent R packages

I am starting to work on a family of R packages, all of which share substantial common code that is housed in its own package; let's call it myPackageUtilities. So I have several packages
myPackage1, myPackage2, etc...
All of these packages depend on every method in myPackageUtilities. For a real-world example, please see statnet on CRAN. The idea is that a future developer might create myPackageN, and instead of having to re-write/duplicate all of the supporting code, this future developer can simply use myPackageUtilities to get started.
There are complications:
1) Some of the code in myPackageUtilities is intended for end users, and the rest is for internal development purposes. The end-user code needs to be properly documented using roxygen2. This code includes both S3 classes and generics, as well as various helper functions for the user.
2) The dependent packages (myPackage1, myPackage2, etc.) will likely extend S3 generics defined in myPackageUtilities.
My question is: what is the best way to assemble all of this? Here are two natural (but non-exhaustive) options:
Include myPackageUtilities under Imports: for all the dependent packages, and force users to separately load myPackageUtilities, or
Include myPackageUtilities under Depends: for all the dependent packages, and be very selective about what is exported from myPackageUtilities so as to avoid cluttering the search path. All of the internal (non-exported) code will have to be accessed via ::: in myPackage1, etc.
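In DESCRIPTION terms, the two options look like this (a minimal sketch; only the relevant field of each dependent package's DESCRIPTION file is shown):

```
Option 1 - myPackage1/DESCRIPTION (users must attach myPackageUtilities themselves):
    Imports: myPackageUtilities

Option 2 - myPackage1/DESCRIPTION (myPackageUtilities is attached along with myPackage1):
    Depends: myPackageUtilities
```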
I originally asked a similar question over here, but quickly discovered that the situation gets complicated. For example:
If we use Imports: instead of Depends:, then any generics defined in myPackageUtilities aren't found by myPackage1, etc.
This makes using the generic templates provided by myPackageUtilities difficult or impossible, and almost defeats the purpose of this entire set-up.
If I define an S3 generic in myPackageUtilities and document it there, how can I have roxygen2 reference those docs in myPackage1?
Perhaps I am deeply misunderstanding how namespaces work, in which case this would be a great place to clear up my misunderstanding!
Welcome to the rabbit hole.
You may be pleasantly surprised to learn that you can import a function from myPackageUtilities into myPackage1 and then export it from myPackage1 to make it accessible from the global environment.
So, when you say that you have a function in myPackageUtilities that should be accessible by the end user when myPackage1 is loaded, this is what I would include in my documentation for fn_name in myPackage1:
#' @importFrom myPackageUtilities fn_name
#' @export fn_name
(See https://github.com/hadley/dplyr/blob/master/R/utils.r for an example)
That still leaves the question of how to link to the original documentation, and I'm afraid I don't have a good answer for that. My current practice is, essentially, to copy the parameter documentation from the original source and then, in my @details section, write: please see the documentation for \code{\link[myPackageUtilities]{fn_name}}
In the end, I still think your best bet is to export everything from myPackageUtilities that will ever get used outside of myPackageUtilities and do a combination import-export in each package where you want a function from myPackageUtilities to be accessible from the global environment.

Is it necessary to export base method extensions in an R package? Documentation implications?

In principle, I could keep these extensions unexported, which would also allow me to not add redundant documentation for these already well-documented methods, while still passing R CMD check myPackage without any reported WARNINGs.
What are some of the drawbacks, if any? Is it perhaps recommended to keep extensions of basic methods compartmentalized within the package that defines them? Alternatively, will this make it more difficult for another package to depend on mine if certain core method extensions are not exported?
For example, if I don't document and don't export the following:
setMethod("show", "myPackageSpecialClass", function(object){ show(NA) })
I'm trying to flesh-out some of these finer details of best-practices with namespaces and base method extensions.
If you don't export the methods, then users (either at the command line or trying to use your classes and methods in their own package via imports) won't be able to use them -- your class will be displayed with the show,ANY-method.
You are not documenting the generic show, but rather the method appropriate for your class, show,myPackageSpecialClass-method. If in your NAMESPACE you
import(methods)
exportMethods(show)
(note that there is no way to export just some methods on the generic show) and provide no documentation, R CMD check will complain
* checking for missing documentation entries ... WARNING
Undocumented S4 methods:
generic 'show' and siglist 'myPackageSpecialClass'
All user-level objects in a package (including S4 classes and methods)
should have documentation entries.
See the chapter 'Writing R documentation files' in the 'Writing R
Extensions' manual.
Your example (I know it was not meant to be a serious show method :) ) is a good illustration of why methods might be documented -- explaining to the user why, every time they try to display the object, they get NA when they were expecting some kind of description of the object.
One approach to documentation is to group methods with the class into a single Rd file, myPackageSpecialClass-class.Rd. This file would contain an alias
\alias{show,myPackageSpecialClass-method}
and a Usage
\S4method{show}{myPackageSpecialClass}(object)
This works so long as no fancy multiple dispatch is used, i.e., it is clear to which class a method applies. If the user asks for help with ?show, they are always pointed toward the methods package help page. For help on your methods / class, they'd need to ask for that specific type of help. There are several ways of doing this but my favorite is
class ? myPackageSpecialClass
method ? "show,myPackageSpecialClass"
This will not be intuitive to the average user: the (class|method) ? ... formulation is not widely used, and the "generic,signature" specification requires a fair amount of understanding of how S4 works. That probably includes a visit to selectMethod(show, "myPackageSpecialClass") (because the method might be implemented on a class that myPackageSpecialClass inherits from) or showMethods(class="myPackageSpecialClass", where=getNamespace("myPackage")) (because you're wondering what you can do with myPackageSpecialClass).

Where to hold PL/SQL constants?

Where do you normally store your PL/SQL constants? On the package-body level? In the specification? I've also seen some people holding constants in a specialized package just for constants. What are best practices in this area?
Thanks.
One downside to having constants in a package body or spec is that when you recompile the package, any user sessions that had the package state in the PGA would get ORA-04068. For this reason, in one large development environment we adopted the convention of having a separate spec-only package to hold the constants (and package globals if any) for each package. We'd then impose a rule saying that these spec-only packages were only allowed to be referenced by their "owning" package - which we enforced at code review. Not a perfect solution, but it worked for us at the time.
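A spec-only constants package in the convention described above might look like this (all names are invented for illustration); since there is no package body, recompiling the owning package's body does not disturb sessions that reference these constants:

```sql
-- Spec-only package holding constants for its "owning" package.
CREATE OR REPLACE PACKAGE orders_const AS
  c_status_open   CONSTANT VARCHAR2(10) := 'OPEN';
  c_status_closed CONSTANT VARCHAR2(10) := 'CLOSED';
  c_max_items     CONSTANT PLS_INTEGER  := 100;
END orders_const;
/
```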
For the same reason, I'd never recommend one-constant-package-to-rule-them-all because every time someone needs to introduce a new constant, or modify an existing one, all user sessions get ORA-04068.
In many cases, you want to keep them in the specification so other packages can use them, especially as parameter values when calling functions and procedures from your package.
Only when you want to keep them private to the package should you put them in the body.
Having a package just for constants might be a good idea for those constants that are not related to any piece of code in particular but are relevant for the whole schema.
For our application, all constants are in a table. A simple function is used to extract them. No problem with recompilation, ORA-04068, ...
Best option in my opinion. Store the "constant" in a table and create a generic function to get the values. No 04068 😀
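A sketch of that table-driven approach (table and function names invented); because the values live in data rather than in package state, changing one is an UPDATE, not a recompile:

```sql
-- Key-value table of application "constants"
CREATE TABLE app_constants (
  name  VARCHAR2(64)  PRIMARY KEY,
  value VARCHAR2(200) NOT NULL
);

-- Generic getter used wherever a constant is needed
CREATE OR REPLACE FUNCTION get_constant(p_name IN VARCHAR2)
  RETURN VARCHAR2
IS
  l_value app_constants.value%TYPE;
BEGIN
  SELECT value INTO l_value FROM app_constants WHERE name = p_name;
  RETURN l_value;
END get_constant;
/
```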
I would prefer the constants to be, by default, in the package body, unless you use the constant as a parameter value for one of your public package procedures/functions or as a return value for your functions.
The problem with putting your constants in the package specification is that if you need to change a constant's type, other packages that use the constant might fail simply because it was publicly available. If the constant was private in the first place, then you don't need to perform an impact analysis for each change.
If you need to store constants like a default language or the like, then I would encapsulate those constants in functions like get_default_language etc. and keep the constants themselves private.
I'd have a concern about having "one package to rule the constants", because package state -- constants, variables, and code -- gets cached in the user's PGA on first invocation of any public variable or method of the package. A package constant, if public, should be scoped to the package and used only by the methods of the package.
A constant whose scope spans packages should be in a code table with a description, joined in as required. Constants aren't and variables don't, after all. Having a key-value-pair table of "constants" makes them all public, and makes changing them dynamically possible.
What if we use a parameterless function, named the same way as the constant, instead of exposing the constant in the package? In that case we can add a new function/constant to the package, change a function's return value, or even remove a function from the package freely, and we'll not get ORA-04068 after recompiling it.
Inside the function's implementation in the package body we can still use a private constant as the return value, and in the function's declaration we can apply performance features such as DETERMINISTIC or perhaps even the result cache.
As a positive side effect, we gain the ability to use the constant in SQL queries.
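The constant-as-function pattern proposed above might be sketched like this (package, function, and value are all invented for illustration); callers see only the function, never the underlying value:

```sql
CREATE OR REPLACE PACKAGE app_defaults AS
  FUNCTION default_language RETURN VARCHAR2 DETERMINISTIC;
END app_defaults;
/
CREATE OR REPLACE PACKAGE BODY app_defaults AS
  -- the actual value stays private to the body
  c_default_language CONSTANT VARCHAR2(2) := 'en';

  FUNCTION default_language RETURN VARCHAR2 DETERMINISTIC IS
  BEGIN
    RETURN c_default_language;
  END default_language;
END app_defaults;
/
-- Usable directly in SQL:
-- SELECT app_defaults.default_language FROM dual;
```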
