R: how to properly include libraries in my package


I'm writing my first R package and was wondering what the right way is to include other libraries, like ggplot2.
There are two places where an import statement can go in my package: the DESCRIPTION file and the NAMESPACE file, where the latter requires including ggplot2 in some roxygen statement like #' @import ggplot2.
I thought it was enough to include ggplot2 in DESCRIPTION, since I assumed that loading my package would also load its dependencies. However, when ggplot2 is not in the NAMESPACE, it seems I cannot use functions such as aes or ggplot without writing ggplot2::aes or ggplot2::ggplot. So what are the import statements in each of the config files, i.e. DESCRIPTION and NAMESPACE, for?

The one required step is to include ggplot2 under Imports: in your DESCRIPTION file. This ensures that ggplot2 is installed when your package is installed.
Then, to use functions from ggplot2 in your package, you need to tell R where to look for them. You can do this in 3 ways:
1. Put the name of the package before the function name when you use it: ggplot2::aes(). This is my preferred method, as it doesn’t add anything extra to your NAMESPACE, avoids any name collisions, and avoids breakage if you later rework or remove a function.
2. Import specific functions as you use them. Do this by putting #' @importFrom ggplot2 aes in the roxygen block above the function using it. Add however many function names you need to that line. After doing that, you can use aes() directly without having to specify the namespace it’s coming from with ggplot2::aes(). This can be convenient if you’re working with a lot of functions (like with ggplot2), but I recommend using the first option instead in general to keep your NAMESPACE tidier. You technically only need to include the @importFrom line once in your package for any function, but I recommend including it above each function that uses the imported functions. That way, if you ever rework or remove a function, the other functions don’t break.
3. Import a whole package of functions by putting #' @import ggplot2 in a roxygen block. This imports every function the package exports. That can be a lot for big packages like ggplot2 and dplyr, and it can cause issues such as name collisions, so I suggest never doing this and instead importing only the specific functions you need or calling them with ggplot2::aes().
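As a rough sketch of the first two approaches side by side, the snippet below shows what a source file under R/ might look like. The function names plot_qualified and plot_imported are hypothetical, and the sketch assumes ggplot2 is listed under Imports: in DESCRIPTION. Sourcing the file only defines the functions; plot_imported() resolves ggplot(), aes() and geom_point() only once the file is built into a package and roxygen2 has written the importFrom() lines to NAMESPACE.

```r
# Way 1: fully qualified calls. Nothing is added to NAMESPACE.
# (plot_qualified is a hypothetical name used for illustration.)
plot_qualified <- function(df) {
  ggplot2::ggplot(df, ggplot2::aes(x = x, y = y)) +
    ggplot2::geom_point()
}

# Way 2: import selected functions with a roxygen tag placed above
# the function that uses them, then call them without a prefix.
# (plot_imported is a hypothetical name used for illustration.)
#' @importFrom ggplot2 ggplot aes geom_point
plot_imported <- function(df) {
  ggplot(df, aes(x = x, y = y)) + geom_point()
}
```

Both functions produce the same plot inside a package; the difference is only in how the ggplot2 functions are looked up.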

If you depend on a package, you should put it in the Imports field of the DESCRIPTION file, after which you can use pkgname::function() in your code.
The usethis::use_package() function can help you do this.
If you want your code to be able to use anything from the package without ::, put a roxygen comment like this somewhere in your package:
#' @import pkgname
NULL
roxygen2 then writes this to your NAMESPACE file.
If you want to import only specific functions (but not others), use the following roxygen2 tag instead:
#' @importFrom pkgname fun1 fun2
NULL
The usethis::use_import_from() function can help you do the above. In the examples above, NULL only indicates that you're not documenting a function or dataset; you can use it at the end of a documentation comment block.
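For orientation, the two roxygen tags above should end up in NAMESPACE roughly as follows after running roxygen2 (for example via devtools::document()). This is a sketch of the generated file for the placeholder names pkgname, fun1 and fun2, not something to write by hand:

```r
# Generated by roxygen2: do not edit by hand

import(pkgname)
importFrom(pkgname,fun1)
importFrom(pkgname,fun2)
```

In practice a package would use either the import() line or the importFrom() lines for a given dependency, not both.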

Related

How can I use something like `package::%to%` in a function/package

I want to use functions from the expss package in my own functions/packages. I usually call the functions along with their packages (e.g. dplyr::mutate(...)).
The expss package has a function/operator %to%, and I don't know how I can do the same here, i.e. expss::%to% doesn't work, neither does expss::'%to%'.
What can I do?

Infix operators must be attached to be usable; you can’t use them prefixed with the package name. [1]
Inside a package, the conventional way is to add an importFrom directive to your NAMESPACE file or, if you’re using ‘roxygen2’, add the following Roxygen directive somewhere:
#' @importFrom expss %to%
Outside of package code, you could use ‘box’ to attach just the operator:
box::use(expss[`%to%`])
Or you can use simple assignment (this is the easiest solution in the simplest case but it becomes a lot of distracting code for multiple operators):
`%to%` = expss::`%to%`
[1] Except using regular function call syntax:
expss::`%to%`(…)
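The two base-R workarounds above can be tried without installing expss. The sketch below uses base::`%in%` purely as a stand-in for expss::`%to%`, and `%among%` is a hypothetical name chosen for the example:

```r
# base::`%in%` stands in for expss::`%to%` so this runs with base R only.

# 1. Regular function-call syntax on the backtick-quoted,
#    namespace-qualified operator name.
via_call <- base::`%in%`(2, 1:3)

# 2. Bind the operator to a local name, then use it in infix position.
#    `%among%` is a hypothetical name used for illustration.
`%among%` <- base::`%in%`
via_infix <- 2 %among% 1:3

print(c(via_call = via_call, via_infix = via_infix))
```

The local binding has to start and end with % for R to parse it as an infix operator.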

Why library() or require() should not be used in an R package

My goal is to create an R package which uses other libraries such as grid and ggplot2.
According to
https://tinyheero.github.io/jekyll/update/2015/07/26/making-your-first-R-package.html, library() or require() should not be used in an R package.
My questions are:
1) Is there a reason? (I put library("ggplot2") and library("grid") in an R script in my package, and it still worked.)
2) Do I have to delete library("ggplot2") and library("grid") from my code and prefix calls with "::", as in ggplot2::geom_segment()?
Is there an efficient way to convert a script into package code?

You should never use library() or require() in a package, because they affect the user's search list, possibly causing errors for the user.
For example, both the dplyr and stats packages export a function called filter. If a user had only library(stats), then filter would mean stats::filter, but if your package called library(dplyr), the user might suddenly find that filter means dplyr::filter, and things would break.
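The masking effect can be reproduced without installing dplyr. The sketch below attaches a small list to the search path to imitate what a library() call inside a package would do to the user's session; "fake_dplyr" and its filter function are hypothetical stand-ins, not the real dplyr package.

```r
# Before: `filter` resolves to the function from the stats package.
before <- identical(filter, stats::filter)

# Imitate library(dplyr) being called behind the user's back: put another
# object named `filter` ahead of stats on the search path.
# "fake_dplyr" is a hypothetical stand-in for illustration only.
attach(list(filter = function(...) "not stats::filter"),
       name = "fake_dplyr", warn.conflicts = FALSE)
during <- identical(filter, stats::filter)

# Clean up so the session is left as it was found.
detach("fake_dplyr", character.only = TRUE)
after <- identical(filter, stats::filter)

print(c(before = before, during = during, after = after))
```

While the stand-in is attached, an unqualified filter() call in the user's own code no longer reaches stats::filter, which is the kind of breakage described above.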
There are a couple of alternatives for your package. You can import functions from another package by listing it in the Imports: field in the DESCRIPTION file and specifying the imports in the NAMESPACE file. (The roxygen2 package can make these changes for you automatically if you put appropriate comments in your .R source files, e.g.
#' @importFrom jsonlite toJSON unbox
before a function that uses those to import toJSON() and unbox() from the jsonlite package.)
The other way to do it is using the :: notation. Then you can still list a package in the Imports: field of DESCRIPTION, but use code like
jsonlite::toJSON(...)
every time you want to call it. Alternatively, if you don't want a strong dependence on jsonlite, you can put jsonlite in Suggests:, and wrap any uses of it in code like
if (requireNamespace("jsonlite", quietly = TRUE)) {
  jsonlite::toJSON(...)
}
Then people who don't have that package will still be able to run your function, but it may skip some operations that require jsonlite.
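A slightly fuller sketch of the Suggests: pattern, assuming jsonlite is listed under Suggests: in DESCRIPTION. The function name to_json_if_available is hypothetical; the snippet runs whether or not jsonlite is installed and simply skips the conversion when it is missing.

```r
# Hypothetical helper: convert to JSON only when jsonlite is installed.
to_json_if_available <- function(x) {
  if (!requireNamespace("jsonlite", quietly = TRUE)) {
    message("Package 'jsonlite' is not installed; skipping JSON conversion.")
    return(invisible(NULL))
  }
  jsonlite::toJSON(x, auto_unbox = TRUE)
}

result <- to_json_if_available(list(a = 1, b = "two"))
```

Returning NULL with a message is one design choice among several; a function that cannot do anything useful without the suggested package might prefer to stop() with an informative error instead.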

Alternatives to placing package in "Depends" section

I'm writing a small package that builds some custom types of graphs using ggplot2. Naturally, my source files are going to be littered with ggplot2 functions. I'm somewhat new to package development, and my understanding is that it's generally better to disambiguate namespaces using :: within package sources. But putting ggplot2:: in front of everything seems like a great recipe for cluttering my code - I'd like to make it as readable and clear as possible to make it easier for my colleagues to work on my code as well.
Is there a way to give my source files access to the ggplot2 namespace? Using library within a package seems to be a big no-no. Putting ggplot2 under "Depends" in the package DESCRIPTION almost does it, but only attaches ggplot2 when I attach my package (thus causing problems if my package is loaded but not attached). Finding a way to automatically attach ggplot2 when my package is loaded would solve those problems, though intuition is telling me this is probably a bad practice somehow.

As mentioned here, you can do this in the roxygen comments:
If you are using many functions from another package, use @import package to import them all and make available without using ::.
Preferably you would put this in the R/packagename-package.R file, that has other standard roxygen tags, like so:
#' @docType package
#' @name packagename
#' @import ggplot2
NULL

R with roxygen2: How to use a single function from another package?

I'm creating an R package that will use a single function from plyr. According to this roxygen2 vignette:
If you are using just a few functions from another package, the
recommended option is to note the package name in the Imports: field
of the DESCRIPTION file and call the function(s) explicitly using ::,
e.g., pkg::fun().
That sounds good. I'm using plyr::ldply() - the full call with :: - so I list plyr in Imports: in my DESCRIPTION file. However, when I use devtools::check() I get this:
* checking dependencies in R code ... NOTE
All declared Imports should be used:
‘plyr’
All declared Imports should be used.
Why do I get this note?
I am able to avoid the note by adding @importFrom plyr ldply in the file that uses plyr, but then I end up having ldply in my package namespace, which I do not want and should not need, as I call plyr::ldply() with the full prefix the single time I use the function.
Any pointers would be appreciated!
(This question might be relevant.)

If ldply() is important for your package's functionality, then you do want it in your package namespace. That is the point of namespace imports. Functions that you need should be in the package namespace, because this is where R will look first for the definition of functions, before then traversing the base namespace and the attached packages. It means that no matter what other packages are loaded or unloaded, attached or unattached, your package will always have access to that function. In such cases, use:
#' @importFrom plyr ldply
And you can just refer to ldply() without the plyr:: prefix just as if it were another function in your package.
If ldply() is not so important - perhaps it is called only once in a not commonly used function - then, Writing R Extensions 1.5.1 gives the following advice:
If a package only needs a few objects from another package it can use a fully qualified variable reference in the code instead of a formal import. A fully qualified reference to the function f in package foo is of the form foo::f. This is slightly less efficient than a formal import and also loses the advantage of recording all dependencies in the NAMESPACE file (but they still need to be recorded in the DESCRIPTION file). Evaluating foo::f will cause package foo to be loaded, but not attached, if it was not loaded already—this can be an advantage in delaying the loading of a rarely used package.
(I think this advice is actually a little outdated because it implies more separation between DESCRIPTION and NAMESPACE than currently exists.) It implies you should use @import plyr and refer to the function as plyr::ldply(). But in reality, it's actually suggesting something like putting plyr in the Suggests field of DESCRIPTION, which isn't exactly accommodated by roxygen2 markup nor exactly compliant with R CMD check.
In sum, the official line is that Hadley's advice (which you are quoting) is only preferred for rarely used functions from rarely used packages (and/or packages that take a considerable amount of time to load). Otherwise, just use @importFrom, as WRE advises:
Using importFrom selectively rather than import is good practice and recommended notably when importing from packages with more than a dozen exports.

R function without exporting in Namespace

I am writing an R package. I have some functions that are not useful outside the package. When I put them in the NAMESPACE file, I get an error about missing documentation for those functions. On the other hand, if I remove them from the NAMESPACE file, I get another problem: function not found. So, is there any way of calling a function without needing to write documentation for it?

As Andrie commented, if you want to include the function in the R package, you need to put its definition inside the package's R folder (e.g. packageparent/R/) and declare it in NAMESPACE only if it should be exported. The function definition itself does not go in NAMESPACE.
If you do not want to include it in your package, none of the functions in your package may call it; otherwise the package will not build. You can still include this function in your package and not write any documentation for it.
To use this function outside your package, just source() its file.
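With roxygen2, an internal helper is simply a function without an @export tag. The sketch below uses two hypothetical functions, rescale01 (internal) and rescale_rounded (exported), to show the pattern; the @noRd tag asks roxygen2 not to generate a help page for the helper.

```r
# Internal helper: no @export tag, so no export() line is written to
# NAMESPACE, and @noRd means no Rd help file is generated for it.
# (rescale01 is a hypothetical name used for illustration.)
#' Rescale a numeric vector to the range [0, 1]
#' @noRd
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# Exported function: documented, and free to call the internal helper
# because both live in the same package namespace.
#' Rescale and round a numeric vector
#'
#' @param x A numeric vector.
#' @return A numeric vector with values in [0, 1], rounded to two decimals.
#' @export
rescale_rounded <- function(x) {
  round(rescale01(x), 2)
}

print(rescale_rounded(c(10, 15, 20)))
```

Once built into a package, users would see only rescale_rounded(), while rescale01() remains reachable from package code (and, for debugging, through the ::: operator).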