I'm writing a package. How can make it such that when library(my_package) is called, other packages are loaded as well? - r

Title should be pretty clear I hope. I'm writing a package called forecasting, with imports for dplyr among other packages. With the imports written in to the DESCRIPTION file, I am able to force these other packages to be installed along with forecasting - is there an equivalent way to do this for the loading of the package? In other words, is there a way that when I load my package with library(forecasting), it automatically also loads dplyr and the other packages?
Thanks

Yes.
Re-read "Writing R Extensions". The Depends: forces both the initial installation as well as the loading of the depended-upon packages.
But these days you want Imports: along with importFrom() in the NAMESPACE file which is more fine-grained.
But first things first: get it working with Depends.

Edit:
Opps you're correct, the documentation I referenced is not a primary source. Perhaps this is better:
From the R documentation:
The ‘Depends’ field gives a comma-separated list of package names which this package depends on. Those packages will be attached before the current package when library or require is called.
and
The ‘Imports’ field lists packages whose namespaces are imported from (as specified in the NAMESPACE file) but which do not need to be attached. Namespaces accessed by the ‘::’ and ‘:::’ operators must be listed here, or in ‘Suggests’ or ‘Enhances’
Original:
From the R packages documentation:
Adding a package dependency here [the DESCRIPTION file] ensures that it’ll be installed. However, it does not mean that it will be attached along with your package (i.e., library(x)). The best practice is to explicitly refer to external functions using the syntax package::function(). This makes it very easy to identify which functions live outside of your package. This is especially useful when you read your code in the future.

Related

Use ::: or export everything when developing multiple, related R packages?

I am working on a family of R packages, all of which share substantial common code which is housed in an internal package, lets call it myPackageUtilities. So I have several packages
myPackage1, myPackage2, etc...
All of these packages depend on every method in myPackageUtilities. For a real-world example, please see statnet on CRAN. Each dependent package, of course, has
Depends: myPackageUtilities
in its DESCRIPTION file.
My question is: In the R code for myPackage1, which of the following two techniques for accessing methods from myPackageUtilities is preferable:
Use ::: to access the methods in myPackageUtilities, or
Export everything from myPackageUtilities (e.g. by including exportPattern("^[^\\.]") in the NAMESPACE)?
Option 2 clutters the end-user's search path, but the R gurus recommend against using :::.
Follow-up question: If (2) is the better choice, is there a way to export everything using roxygen2?
Suppose we have a package called randomUtils and this package has a function called sd that calculates the Slytherin Defiance Quotient.
Now I write a package called spellbound. If spellbound Depends on randomUtils, then randomUtils::sd will be found in the search path and can conflict with calculating standard deviation.
If spellbound Imports randomUtils, however, then R will install randomUtils but will not load it when spellbound is loaded. Thus, the new version of sd can't be found on the search path, but can still be accessed by randomUtils::sd
With an ever growing body of contributed work on CRAN, it is becoming very important that we use Imports as much as possible so that we don't introduce unexpected behaviors by having conflicting function definitions.
An example of when I have used Depends: when writing the HydeNet package, I wanted the end user to be able to use the rjags package in concert with HydeNet. So I put rjags in Depends so that library (HydeNet) would loaf both packages. (In other words, put rjags on the search path.
Moral of the story, if you don't intend for the user to directly access the functions, it should go in Imports.

R: selective import with importFrom: namespace issues [duplicate]

The "Writing R Extensions" manual provides the following guidance on when to use Imports or Depends:
The general rules are
Packages whose namespace only is needed to load the package using library(pkgname) must be listed in the ‘Imports’ field and not in the
‘Depends’ field.
Packages that need to be attached to successfully load the package using library(pkgname) must be listed in the ‘Depends’ field, only.
Can someone provide a bit more clarity on this? How do I know when my package only needs namespaces loaded versus when I need a package to be attached? What are examples of both? I think the typical package is just a collection of functions that sometimes call functions in other packages (where some bit of work has already been coded-up). Is this scenario 1 or 2 above?
Edit
I wrote a blog post with a section on this specific topic (search for 'Imports v Depends'). The visuals make it a lot easier to understand.
"Imports" is safer than "Depends" (and also makes a package using it a 'better citizen' with respect to other packages that do use "Depends").
A "Depends" directive attempts to ensure that a function from another package is available by attaching the other package to the main search path (i.e. the list of environments returned by search()). This strategy can, however, be thwarted if another package, loaded later, places an identically named function earlier on the search path. Chambers (in SoDA) uses the example of the function "gam", which is found in both the gam and mgcv packages. If two other packages were loaded, one of them depending on gam and one depending on mgcv, the function found by calls to gam() would depend on the order in which they those two packages were attached. Not good.
An "Imports" directive should be used for any supporting package whose functions are to be placed in <imports:packageName> (searched immediately after <namespace:packageName>), instead of on the regular search path. If either one of the packages in the example above used the "Imports" mechanism (which also requires import or importFrom directives in the NAMESPACE file), matters would be improved in two ways. (1) The package would itself gain control over which mgcv function is used. (2) By keeping the main search path clear of the imported objects, it would not even potentially break the other package's dependency on the other mgcv function.
This is why using namespaces is such a good practice, why it is now enforced by CRAN, and (in particular) why using "Imports" is safer than using "Depends".
Edited to add an important caveat:
There is one unfortunately common exception to the advice above: if your package relies on a package A which itself "Depends" on another package B, your package will likely need to attach A with a "Depends directive.
This is because the functions in package A were written with the expectation that package B and its functions would be attached to the search() path.
A "Depends" directive will load and attach package A, at which point package A's own "Depends" directive will, in a chain reaction, cause package B to be loaded and attached as well. Functions in package A will then be able to find the functions in package B on which they rely.
An "Imports" directive will load but not attach package A and will neither load nor attach package B. ("Imports", after all, expects that package writers are using the namespace mechanism, and that package A will be using "Imports" to point to any functions in B that it need access to.) Calls by your functions to any functions in package A which rely on functions in package B will consequently fail.
The only two solutions are to either:
Have your package attach package A using a "Depends" directive.
Better in the long run, contact the maintainer of package A and ask them to do a more careful job of constructing their namespace (in the words of Martin Morgan in this related answer).
Hadley Wickham gives an easy explanation (http://r-pkgs.had.co.nz/namespace.html):
Listing a package in either Depends or Imports ensures that it’s
installed when needed. The main difference is that where Imports just
loads the package, Depends attaches it. There are no other
differences. [...]
Unless there is a good reason otherwise, you should always list
packages in Imports not Depends. That’s because a good package is
self-contained, and minimises changes to the global environment
(including the search path). The only exception is if your package is
designed to be used in conjunction with another package. For example,
the analogue package builds on top of vegan. It’s not useful without
vegan, so it has vegan in Depends instead of Imports. Similarly,
ggplot2 should really Depend on scales, rather than Importing it.
Chambers in SfDA says to use 'Imports' when this package uses a 'namespace' mechanism and since all packages are now required to have them, then the answer might now be always use 'Imports'. In the past packages could have been loaded without actually having namespaces and in that case you would need to have used Depends.
Here is a simple question to help you decide which to use:
Does your package require the end user to have direct access to the functions of another package?
NO -> Imports (most common answer)
YES -> Depends
The only time you should use 'Depends' is when your package is an add-on or companion to another package, where your end user will be using functions from both your package and the 'Depends' package in their code. If your end user will only be interfacing with your functions, and the other package will only be doing work behind the scenes, then use 'Imports' instead.
The caveat to this is that if you add a package to 'Imports', as you usually should, your code will need to refer to functions from that package, using the full namespace syntax, e.g. dplyr::mutate(), instead of just mutate(). It makes the code a little clunkier to read, but it’s a small price to pay for better package hygiene.

How to handle dependencies (`Depends:`) of imported packages (`Imports:`)

I'm trying to use Imports: instead of Depends: in the DESCRIPTION files of my packages, yet I still feel I've got some more to understand on this ;-)
What I learned from this post (by the way: awesome post!!!) is that everything my package, say mypkg, imports (say imported.pkg) via Imports: lives in environment imports:mypkg instead of being attached to the search path. When trying to find foo that ships with imported.pkg, R looks in imports:mypkg before traversing the search list. So far, so good.
Actual question
If imported.pkg (imported by mypkg) depends on a certain other package (stated in Depends: section of the package's DESCRIPTION file), do I need to make this very package a Depends: dependency of my package in order for R to find functions of that package? So it seems to me at the moment as otherwise R complains.
Evidence
Seems like simply importing such a package is not enough. As an example, take package roxygen2 (CRAN). It depends on digest while importing a bunch of other packages. I imported it (along with digest as mypkg also needs it) and checked environment imports:mypkg which does list the digest function: "digest" %in% parent.env(asNamespace("mypkg")) returns TRUE
Yet when running roxygenize() from within a function that is part of mypkg, R complains that it can't find digest.
You could have a look at my blog : http://r2d2.quartzbio.com/posts/package-depends-dirty-hack-solution.html
Now i have a better and cleaner solution but not published yet.
Hope it helps.

Better explanation of when to use Imports/Depends

The "Writing R Extensions" manual provides the following guidance on when to use Imports or Depends:
The general rules are
Packages whose namespace only is needed to load the package using library(pkgname) must be listed in the ‘Imports’ field and not in the
‘Depends’ field.
Packages that need to be attached to successfully load the package using library(pkgname) must be listed in the ‘Depends’ field, only.
Can someone provide a bit more clarity on this? How do I know when my package only needs namespaces loaded versus when I need a package to be attached? What are examples of both? I think the typical package is just a collection of functions that sometimes call functions in other packages (where some bit of work has already been coded-up). Is this scenario 1 or 2 above?
Edit
I wrote a blog post with a section on this specific topic (search for 'Imports v Depends'). The visuals make it a lot easier to understand.
"Imports" is safer than "Depends" (and also makes a package using it a 'better citizen' with respect to other packages that do use "Depends").
A "Depends" directive attempts to ensure that a function from another package is available by attaching the other package to the main search path (i.e. the list of environments returned by search()). This strategy can, however, be thwarted if another package, loaded later, places an identically named function earlier on the search path. Chambers (in SoDA) uses the example of the function "gam", which is found in both the gam and mgcv packages. If two other packages were loaded, one of them depending on gam and one depending on mgcv, the function found by calls to gam() would depend on the order in which they those two packages were attached. Not good.
An "Imports" directive should be used for any supporting package whose functions are to be placed in <imports:packageName> (searched immediately after <namespace:packageName>), instead of on the regular search path. If either one of the packages in the example above used the "Imports" mechanism (which also requires import or importFrom directives in the NAMESPACE file), matters would be improved in two ways. (1) The package would itself gain control over which mgcv function is used. (2) By keeping the main search path clear of the imported objects, it would not even potentially break the other package's dependency on the other mgcv function.
This is why using namespaces is such a good practice, why it is now enforced by CRAN, and (in particular) why using "Imports" is safer than using "Depends".
Edited to add an important caveat:
There is one unfortunately common exception to the advice above: if your package relies on a package A which itself "Depends" on another package B, your package will likely need to attach A with a "Depends directive.
This is because the functions in package A were written with the expectation that package B and its functions would be attached to the search() path.
A "Depends" directive will load and attach package A, at which point package A's own "Depends" directive will, in a chain reaction, cause package B to be loaded and attached as well. Functions in package A will then be able to find the functions in package B on which they rely.
An "Imports" directive will load but not attach package A and will neither load nor attach package B. ("Imports", after all, expects that package writers are using the namespace mechanism, and that package A will be using "Imports" to point to any functions in B that it need access to.) Calls by your functions to any functions in package A which rely on functions in package B will consequently fail.
The only two solutions are to either:
Have your package attach package A using a "Depends" directive.
Better in the long run, contact the maintainer of package A and ask them to do a more careful job of constructing their namespace (in the words of Martin Morgan in this related answer).
Hadley Wickham gives an easy explanation (http://r-pkgs.had.co.nz/namespace.html):
Listing a package in either Depends or Imports ensures that it’s
installed when needed. The main difference is that where Imports just
loads the package, Depends attaches it. There are no other
differences. [...]
Unless there is a good reason otherwise, you should always list
packages in Imports not Depends. That’s because a good package is
self-contained, and minimises changes to the global environment
(including the search path). The only exception is if your package is
designed to be used in conjunction with another package. For example,
the analogue package builds on top of vegan. It’s not useful without
vegan, so it has vegan in Depends instead of Imports. Similarly,
ggplot2 should really Depend on scales, rather than Importing it.
Chambers in SfDA says to use 'Imports' when this package uses a 'namespace' mechanism and since all packages are now required to have them, then the answer might now be always use 'Imports'. In the past packages could have been loaded without actually having namespaces and in that case you would need to have used Depends.
Here is a simple question to help you decide which to use:
Does your package require the end user to have direct access to the functions of another package?
NO -> Imports (most common answer)
YES -> Depends
The only time you should use 'Depends' is when your package is an add-on or companion to another package, where your end user will be using functions from both your package and the 'Depends' package in their code. If your end user will only be interfacing with your functions, and the other package will only be doing work behind the scenes, then use 'Imports' instead.
The caveat to this is that if you add a package to 'Imports', as you usually should, your code will need to refer to functions from that package, using the full namespace syntax, e.g. dplyr::mutate(), instead of just mutate(). It makes the code a little clunkier to read, but it’s a small price to pay for better package hygiene.

Upcoming NAMESPACE, Depends, Imports changes for 2.14.0 (some definitions/use please)

If you are a package author, you are hopefully well aware of upcoming changes in package structure when we move to 2.14 in about a week. One of the changes is that all packages will require a NAMESPACE, and one will be generated for you in the event you do not make one (the R equivalent of your Miranda rights in the US). So being good citizen I was trying to figure this out. Here is the section from R-exts:
1.6.5 Summary – converting an existing package
To summarize, converting an existing package to use a namespace
involves several simple steps:
Identify the public definitions and place them in export directives.
Identify S3-style method definitions and write corresponding S3method
declarations. Identify dependencies and replace any require calls by
import directives (and make appropriate changes in the Depends and
Imports fields of the DESCRIPTION file). Replace .First.lib functions
with .onLoad functions or useDynLib directives.
To ensure I do the right thing here, can someone give a short clear definition/answer (am I breaking a rule by having several small but related questions together?). All answers should take 2.14 into account, please:
A definition of NAMESPACE as used by R
Is there a way to generate a NAMESPACE prior to build and check, or do we b/c once and then edit the NAMESPACE created automatically?
The difference between "Depends:" and "Imports:" in the DESCRIPTION file. In particular, why would a I put a package in "Depends:" instead of the "Imports:" or vice versa?
It sounds like "require" is no longer to be used, though it doesn't say that. Is this the correct interpretation?
Thanks!
I've written a little on this topic at https://github.com/hadley/devtools/wiki/Namespaces.
To answer your questions:
See Dirk's answer.
Use roxygen2
Now that every package has a namespace there is little reason to use Depends.
require should only be used to load suggested packages
CRAN packages have had NAMESPACEs since almost time immortal. Just pick a few of your favorite CRAN packages and look at their NAMESPACE files.
It can be as easy as this one-liner (plus comment) taken from snow:
# Export all names unless they start with a dot
exportPattern("^[^.]")
The run R CMD check as usual, and you should be fine in most circumstances.
I'm going to answer my own question with a few details I've learned after switching several packages over to R 2.14.
The description above from the manual sort of gives the impression that whatever you had in Depends: for R 2.13 should be moved over to Imports: in R 2.14. You should do that, but they are not 1-for-1 functionally the same, as I hope will be clear from the notes below.
Here we go:
Depends: should probably be used only for restrictions on versions, like 'R >= 2.10' or 'MASS > 0.1' and nothing else under R 2.14.
Having a namespace is partly a mechanism of notifying users that there may be name conflicts and "replacements" -- in other words overwriting of names in use. The NAMESPACE file must match in items and in order the Imports: field in DESCRIPTION. Function names etc imported will be listed under "loaded via namespace (and not attached)" in sessionInfo(). Those packages are installed but not loaded (i.e. no library(some imported package)).
The other role of a namespace is to make functions available to your package "internally". By that, I mean that if your package uses a function in an imported package, it will be found.
However, when you have an example in an .Rd file to be run during checking, packages that you used to have under Depends: in R 2.13 but are now in Imports: under R 2.14 are not available. This is because the checking environment is pretty much like sourcing a script in a clean environment (assuming you are using R --vanilla so .Rprofiles etc have not been run). Unless you put a library(needed package) statement in your example, it won't work under R 2.14 even if it did under R 2.13. So the old examples do not necessarily run even though your package Imports: the needed packages, because again Imports: is not quite the same as Depends: (strictly, they are attached but not loaded).
Please do correct me if any of this is wrong. Many thanks to Hadley Wickham and others who helped me along!
I recently worked on this for one of my packages. Here are my new Depends, Imports, and Suggests lines
Depends: R (>= 2.15.0)
Imports: nlme, mvtnorm, KFAS (>= 0.9.11), stats, utils, graphics
Suggests: Hmisc, maps, xtable, stringr
stats, utils and graphics are part of base R, but the user could detach them and then my package wouldn't work. If you use R from the command line, you might think 'why would anyone detach those?'. But if a user is using RStudio, say, I could see them going through and 'unclicking' all the packages. Strange thing to do, nonetheless, I don't want my package to stop working if they do that. Or they might redefine, say, the plot function (or some other function) , and then my package would fail.
My NAMESPACE then has the following lines
import(KFAS)
import(stats)
import(utils)
import(graphics)
I don't want to go through and keep track of which stats, utils and graphics functions I use, so I import their whole namespaces. For KFAS, I only need 2 functions, but the exported function names changed between versions so I import the whole namespace and then in my code test which version the user has.
For mvtnorm and nlme, I only use one function so I import just those. I could import the whole namespace, but try to only import what I really use.
importFrom(mvtnorm, rmvnorm)
importFrom(nlme, fdHess)
The vignettes where the Suggests packages appear have
require(package)
lines in them.
For the exported functions in my NAMESPACE, I'm a little torn. CRAN has become strict in not allowing ::: in your package code. That means that if I don't export a function, I am restricting creative re-use. On the other hand, I understand the need to only export functions that you intend to maintain with a stable arg list and output otherwise we break each other's packages by changing the function interface.

Resources