How do I load data from another package from within my package?

One of the functions in a package that I am developing uses a data set from the acs package (the fips.state object). I can load this data into my working environment via
data(fips.state, package = "acs")
but I do not know the proper way to load this data for my function. I have tried
@importFrom acs fips.state
but data sets are not exported. I do not want to copy the data and save it to my package, because that seems like poor development practice.
I have looked in http://r-pkgs.had.co.nz/namespace.html, http://kbroman.org/pkg_primer/pages/docs.html, and https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Data-in-packages, but they do not include any information on sharing data sets from one package to another.
Basically, how do I make a data set that lives in another package available to the functions in my package?

If you don't have control over the acs package, then acs::fips.state seems to be your best bet, as suggested by @paleolimbot.
If you are going to make frequent calls to fips.state, then I'd suggest making a local copy via fips.state <- acs::fips.state, as there is a small cost in looking up objects from other packages that you might do well to avoid incurring multiple times.
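For instance, a pattern along these lines inside your package keeps the cross-package lookup to a single call per session (the cache environment and accessor name here are my own, not part of acs):
# Package-local cache so acs::fips.state is only looked up once.
.cache <- new.env(parent = emptyenv())

get_fips_state <- function() {
  if (is.null(.cache$fips.state)) {
    .cache$fips.state <- acs::fips.state
  }
  .cache$fips.state
}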
But if you are able to influence the acs package (and even if you are not, I think this is a useful generalization), then mikefc suggests an alternative solution, which is to store the fips.state object as non-internal package data, and then to export it:
usethis::use_data(fips.state, other.data, internal = FALSE)
And then in NAMESPACE:
export(fips.state)
or if using roxygen2:
#' Fips state
#' @name fips.state
#' @export
"fips.state"
Then in your own package, you can simply @importFrom acs fips.state.
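Assuming acs then exports fips.state, the consuming side might look like this sketch (the lookup function and the STUSAB column name are my own illustrations; acs must also be listed under Imports: in your DESCRIPTION):
#' Look up rows of fips.state for one state abbreviation
#' @importFrom acs fips.state
#' @export
state_rows <- function(abbr) {
  # fips.state is bound into this package's namespace by the import;
  # the STUSAB column name is an assumption for illustration.
  fips.state[fips.state$STUSAB == abbr, ]
}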

You can always use package::object_name (e.g., dplyr::starwars) anywhere in your package code, without using an import statement.
is_starwars_character <- function(character) {
  character %in% dplyr::starwars$name
}
is_starwars_character("Luke Skywalker")
#> [1] TRUE
is_starwars_character("Indiana Jones")
#> [1] FALSE

Related

Loading libraries for use in a specific environment in R

I have written some functions to facilitate repeated tasks among my R projects. I am trying to use an environment to load them easily, but also to prevent them from showing up when I use ls() or from being wiped by rm(list=ls()).
As a dummy example I have an environment loader function in a file that I can just source from my current project and an additional file for each specialized environment I want to have.
currentProject.R
environments/env_loader.R
environments/colors_env.R
env_loader.R
.environmentLoader <- function(env_file, env_name='my_env') {
  sys.source(env_file, envir=attach(NULL, name=env_name))
}
path <- dirname(sys.frame(1)$ofile) # this script's path
#
# Automatically load
.environmentLoader(paste(path, 'colors_env.R', sep='/'), env_name='my_colors')
colors_env.R
library(RColorBrewer) # this doesn't work
# Return a list of colors
dummyColors <- function(n) {
  require(RColorBrewer) # This doesn't work
  return(brewer.pal(n, 'Blues'))
}
CurrentProject.R
source('./environments/env_loader.R')
# Get a list of 5 colors
dummyColors(5)
This works great except when my functions require me to load a library. In my example, I need to load the RColorBrewer library to use the brewer.pal function in colors_env.R, but as things stand I just get an error: Error in brewer.pal(n, "Blues") : could not find function "brewer.pal".
I tried just using library(RColorBrewer) or using require inside my dummyColors function or adding stuff like evalq(library("RColorBrewer"), envir=parent.env(environment())) to the colors_env.R file but it doesn't work. Any suggestions?
If you are using similar functions across projects, I would recommend creating an R package. It's essentially what you're doing in many ways, but you don't have to reinvent a lot of the loading mechanisms, etc. Hadley Wickham's book R Packages is very good on this topic. It doesn't need to be a fully built out, CRAN-ready sort of thing; you can just create a personal package with the miscellaneous functions you frequently use.
That being said, the solution for your specific question would be to explicitly use the namespace to call the function.
dummyColors <- function(n) {
  require(RColorBrewer) # redundant once the namespace is used below
  return(RColorBrewer::brewer.pal(n, 'Blues'))
}
Create a package and then run it. Use pkgKitten::kitten() to build the boilerplate, copy your file into it, optionally build it if you want a .tar.gz file (or omit that step if you don't need it), and finally install it. Then test it out. We assume colors_env.R, shown in the question, is in the current directory.
(Note that require should always be wrapped in an if so that a failed load is caught; if not within an if, use library, which guarantees an error message in that case.)
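In miniature, that guarded pattern looks like this (the stop() message is illustrative):
if (!require(RColorBrewer)) {
  stop("dummyColors() needs RColorBrewer; please install it first.")
}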
# create package
library(devtools)
library(pkgKitten)
kitten("colors")
file.copy("colors_env.R", "./colors/R")
build("colors") # optional = will create colors_1.0.tar.gz
install("colors")
# test
library(colors)
dummyColors(5)
## Loading required package: RColorBrewer
## [1] "#EFF3FF" "#BDD7E7" "#6BAED6" "#3182BD" "#08519C"

Can our R package depend on an invisible `pkg:::function()` from another package [duplicate]

I am creating an R package in RStudio. Say I have two functions fnbig() and fnsmall() in my package named foo. fnbig() is a function that must be accessible to the user of the package. fnsmall() is an internal function that must not be accessible to the user, but should be accessible inside fnbig().
# package code
fnsmall <- function() {
  # bla bla..
}

#' @export
fnbig <- function() {
  # bla bla..
  x <- fnsmall()
  # bla..
}
I have tried exporting fnsmall(). All works, but it litters the NAMESPACE. I tried not exporting fnsmall(), but then it doesn't work inside fnbig() when using x <- fnsmall() or x <- foo::fnsmall(). Then I tried x <- foo:::fnsmall(), and it works. But I read that using ::: is not recommended.
What is the best way to go about doing this? How do I call an internal function from an exported function?
But I read that using ::: is not recommended.
I think you mention this based on the following statement in R's Writing R Extensions manual:
Using foo:::f instead of foo::f allows access to unexported objects. This is generally not recommended, as the semantics of unexported objects may be changed by the package author in routine maintenance.
The reason this is not recommended is that unexported functions have no documentation, and as such come with no guarantee from the package author that they will keep doing what they do now.
However, since you are referring to your own unexported functions, you have full control of what is happening in those functions, so this objection is not as relevant.
Referring to it as foo:::fnsmall is therefore a good option.
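In code, the arrangement then stays as simple as this sketch (package name foo from the question; the function bodies are placeholders):
# Internal helper: no @export tag, so it stays out of NAMESPACE.
fnsmall <- function() {
  42 # placeholder body
}

#' @export
fnbig <- function() {
  x <- foo:::fnsmall() # explicit reference to your own internal function
  x + 1
}
(As an aside, once the package is properly built and loaded, a plain fnsmall() call also resolves within the package's own namespace, so the ::: qualifier is mainly useful when testing interactively.)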

Access list of functions and metadata for github dev R package

Please Note: This is cross-posted from here where it hasn't received a response.
So I'm adding it here.
I'm currently co-developing an R package on GitHub which can be installed using devtools::install_github('repo/pkgname'), as usual.
We have diligently used roxygen2 to document the individual functions.
We have split the functions into "internal" (@keywords internal) vs. "external" (@export),
so that the user gets to use the external functions, i.e. pkgname::external_<fn_name>,
and access their documentation. They can also use ::: to access the internal
functions if they wish.
For some meta-analysis of our package it would be nice to have functionality
that produced a tidy tibble with the following columns:
function name,
function type, i.e. internal/external (accessible by :: or ::: to the user),
more metadata, e.g. another column containing parameter names for each function (i.e. @param values),
documentation strings for each parameter.
As a crude version (non-tibble format) for, say, dplyr, one can do something like:
library(dplyr) # Assume installed already
ls('package:dplyr')
This produces a character vector of function names, but not a tidy tibble with more
useful metadata.
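Slightly less crude, and without attaching the package, the internal/external split can at least be read off the namespace itself (a sketch; assumes tibble is installed):
# Classify namespace objects as exported vs. internal -- still no
# parameter metadata, just a tidier starting point than ls().
exported <- getNamespaceExports("dplyr")
all_objs <- ls(getNamespace("dplyr"))
tibble::tibble(
  name = all_objs,
  internal = !(all_objs %in% exported)
)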
Ideally we would be able to produce this tibble after doing devtools::load_all(".") in
our package development, to track changes in real-time.
Are there any existing R packages that can help generate such a metadata tibble?
Or can such a function be developed for this using existing R packages?
Would appreciate any help associated with this.
I have an answer to my question, which may help others. It turns out this metadata can be accessed using the amazing pkgdown package.
See below for code to use when you have opened an RStudio project attached to a package
you are developing (using devtools):
# Setup - install required libraries
# install.packages(c("pkgdown", "here"))
# If you are in your local package directory, run the following
# to get the required package metadata
pkg <- pkgdown::as_pkgdown(pkg = here::here())
# Inspect the topics object, which contains function metadata
pkg$topics %>% dplyr::glimpse()
# Get list of all functions and just the required metadata
pkg_fns_all <- pkg$topics %>%
  dplyr::select(name, file_in, internal)
# Get the non-internal functions, accessed using pkgname::function
pkg_fns_user <- pkg_fns_all %>% dplyr::filter(!internal)
# Get the internal functions, accessed using pkgname:::function
pkg_fns_internal <- pkg_fns_all %>% dplyr::filter(internal)
Hope this helps others :)
A few small outstanding items:
I'm not sure how to get access to individual function @param values from the
above, but if anyone can add some details around that, it would be useful.
I'm not sure how to apply this to CRAN-installed packages on my system, e.g. dplyr.
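On the first open item: for an installed package (which also covers the second item, e.g. dplyr from CRAN), the parsed help pages built from the @param tags can be read with tools::Rd_db(). The helpers below are my own rough sketch, and "mutate.Rd" is just an illustrative entry name:
# Read the parsed Rd documentation of an installed package.
rd <- tools::Rd_db("dplyr")

# Tag of each top-level element of an Rd object.
rd_tags <- function(doc) vapply(doc, function(x) attr(x, "Rd_tag"), character(1))

# Parameter names from one Rd object's \arguments section.
rd_param_names <- function(doc) {
  args <- doc[rd_tags(doc) == "\\arguments"]
  if (length(args) == 0) return(character(0))
  items <- Filter(function(x) attr(x, "Rd_tag") == "\\item", args[[1]])
  # Each \item holds the parameter name(s) first, then its description.
  vapply(items, function(it) paste(unlist(it[[1]]), collapse = ""), character(1))
}

rd_param_names(rd[["mutate.Rd"]])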

R: How to check whether the libraries I am loading are actually used in my code?

I have used several R packages for my study. All libraries are loaded together at the beginning of my code, and here is the problem: it turns out that I have done several tests with different functions from those packages, but I have not used all of them in the final code. Therefore, I am loading libraries that I do not use.
Would there be any way to check the libraries to know if they really are necessary for my code?
Start by restarting R with a fresh environment, no libraries loaded. For this demonstration, I'm going to define two functions:
zoo1 <- function() na.locf(1:10)
zoo2 <- function() zoo::na.locf(1:10)
With no libraries loaded, let's try something:
codetools::checkUsage(zoo1)
# <anonymous>: no visible global function definition for 'na.locf'
codetools::checkUsage(zoo2)   # no output: the call is fully qualified with zoo::
library(zoo)
# Attaching package: 'zoo'
# The following objects are masked from 'package:base':
#     as.Date, as.Date.numeric
codetools::checkUsage(zoo1)   # no output now that zoo is attached
Okay, so we know we can check a single function to see if it is abusing scope and/or using non-base functions. Let's assume that you've loaded your script full of functions (but not the calls to require or library), so let's do this process for all of them. Let's first unload zoo, so that we'll see a complaint again about our zoo1 function:
detach("package:zoo", unload=TRUE)
Now let's iterate over all functions:
allfuncs <- Filter(function(a) is.function(get(a)), ls())
str(sapply(allfuncs, function(fn) capture.output(codetools::checkUsage(get(fn))), simplify=FALSE))
# List of 2
# $ zoo1: chr "<anonymous>: no visible global function definition for 'na.locf'"
# $ zoo2: chr(0)
Now you know to look in the function named zoo1 for a call to na.locf. It'll be up to you to find which not-yet-loaded package this function resides in, but that might be much more reasonable, depending on the number of packages you are loading.
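If that hunt gets tedious, a small helper can scan installed packages' exports for the missing name (my own sketch; it loads each namespace, so it can be slow on a large library):
find_owner <- function(fname) {
  pkgs <- rownames(utils::installed.packages())
  Filter(function(p) {
    ns <- tryCatch(getNamespace(p), error = function(e) NULL)
    !is.null(ns) && fname %in% getNamespaceExports(ns)
  }, pkgs)
}
find_owner("na.locf")
## expected to include "zoo"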
Some side-thoughts:
If you have a script file that does not have everything comfortably ensconced in functions, then just wrap all of the global R code into a single function, say bigfunctionfortest <- function() { as the first line and } as the last. Then source the file and run codetools::checkUsage(bigfunctionfortest).
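That trick in miniature (the function body stands in for your script's global code):
bigfunctionfortest <- function() {
  # ...entire script body pasted here, e.g.:
  dat <- na.locf(c(1, 2, NA, 3)) # an unqualified, non-base call
  summary(dat)
}
codetools::checkUsage(bigfunctionfortest)
# <anonymous>: no visible global function definition for 'na.locf'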
Package developers have to go through a process that uses this, so that the Imports: and Depends: fields of DESCRIPTION and the corresponding directives in NAMESPACE (another ref: http://r-pkgs.had.co.nz/namespace.html) will be correct. One good trick that will prevent "namespace pollution" is loading the namespace but not the package ... and though that may sound confusing, it amounts to using zoo::na.locf for all non-base functions. This gets old quickly (especially if you are using dplyr and such, where most of your daily functions are non-base), suggesting those oft-used functions should be directly imported instead of just referenced wholesale. If you're familiar with python, then:
# R
library(zoo)
na.locf(c(1,2,NA,3))
is analogous to
# fake-python
from zoo import *
na_locf([1,2,None,3])
(if that package/function exists). Then the non-polluting variant looks like:
# R
zoo::na.locf(c(1,2,NA,3))
# fake-python
import zoo
zoo.na_locf([1,2,None,3])
where the function's package (and/or subdir packaging) must be used explicitly. There is no ambiguity. It is explicit. This is by some/many considered "A Good Thing (tm)".
(Language-philes will likely say that library(zoo) and from zoo import * are not exactly the same ... a better way to describe what is happening is that they bring everything from zoo into the search path of functions, potentially causing masking as we saw in a console message earlier; while the :: functionality only loads the namespace but does not add it to the search path. Lots of things going on in the background.)

How to manage internal utility functions shared by two R packages

I am developing an R package but I would like to break it down into two packages,
say A and B, where B depends on A.
In the course of development I have created a number of internal utility
functions, say .util1(), .util2(), etc. They are useful to keep
my code tidy and avoid repetitions, but I don't want to export them and make them
available to other users.
Rather than having one copy of these functions in both A and B, my idea was to put all of them in package A, and then access them from B using A:::.util1(), etc. On the other hand, that doesn't look very neat, and I will have to document all these "hidden" dependencies somewhere (given that I will not explicitly export them from A). Are there other alternatives? Thanks!
How about this, using the "zoo" package and its internal variable .packageName for illustration purposes. You may replace them with the names of your own package and internal variable/function when testing.
library(zoo)                   # Load a library
zoo:::.packageName             # Access an internal variable
.packageName                   # A test - fails without the namespace
pkg.env <- getNamespace("zoo") # Store the namespace
attach(pkg.env)                # Attach it
.packageName                   # Now succeeds when called directly!
detach(pkg.env)                # Detach it afterwards
(Edited)
## To export an internal object to the current namespace (without "attach")
assign(".packageName", get(".packageName", envir = pkg.env))
## Or use a loop if you have a few internal objects to export
for (obj_name in a_list_of_names) {
  assign(obj_name, get(obj_name, envir = pkg.env))
}
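Back in the two-package setting of the question, the least magical alternative may be thin wrappers in B around A's internals, so the dependency is at least visible in B's own source (a sketch using the names from the question):
# In package B's R/ code: one-line wrappers around package A's
# unexported helpers, keeping every A:::* reference in one place.
.util1 <- function(...) A:::.util1(...)
.util2 <- function(...) A:::.util2(...)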
