devtools::document() throwing error - R

I am putting together an R data package, and I have been documenting the datasets without issue until now. The following is included in a file called charges_ay.R located in the R folder in package repo.
#' Student Charges for Academic Year programs.
#'
#' For more information, download a data dictionary from the IPEDS website.
#'
#' Survey years 2002 - 2014.
#'
#' @source http://nces.ed.gov/ipeds/datacenter/DataFiles.aspx
#' @format Data frame with columns
"charges_ay"
When I attempt to run devtools::document() from the base of the package (as I have for the other files), I get the following error:
> devtools::document()
Updating ripeds documentation
Loading ripeds
Error: 'charges_ay' is not an exported object from 'namespace:ripeds'
Given that everything has worked fine until now, I am a bit confused, as the process and file documentation are all the same.
Any help will be greatly appreciated!

When I ran into this in my own package, it seemed to be a workflow issue. Try either running use_data(charges_ay) prior to document(), or adding use_data(charges_ay) at the end of your data-generating file.
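A minimal sketch of that workflow (the source file name here is hypothetical; the key point is that use_data() must have written data/charges_ay.rda before document() tries to resolve the "charges_ay" doc block):

```r
# data-raw/charges_ay.R -- build the dataset, then save it into data/
charges_ay <- read.csv("data-raw/charges_ay_source.csv")  # hypothetical source file

# Writes data/charges_ay.rda so the object exists when the package is loaded
devtools::use_data(charges_ay, overwrite = TRUE)

# Now the roxygen block in R/charges_ay.R can be matched to the object
devtools::document()
```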

Data not found in documentation with roxygen2

I am working with roxygen2 library and devtools. Building a package with the following structure:
Inside /data folder I have two .rda files with the information of each dataset. Let's call them data1.rda and data2.rda.
Inside /R folder I have two files, one with the functions created (and their explanation) and another one called data.R with the information of each dataset.
#' Description 1
#'
#' Simple definition
#'
#' @format The \code{data.frame} contains 2 variables:
#' \describe{
#' \item{a}{The first variable.}
#' \item{b}{The second variable.}
#' }
"data1"
When I run roxygen2::roxygenize() I get this message:
First time using roxygen2. Upgrading automatically...
Error in get(name, envir = env) : object 'data1' not found.
I have looked for similar questions without finding an answer to this problem. Does anyone have a suggestion?
It might be a silly question but are you running roxygenise on your loaded package? Meaning that you first run devtools::load_all(), and then roxygen2::roxygenise().
I've seen a couple of people making this mistake on other posts.
The roxygen2::roxygenize() method does not load the package properly, but you can replace this step with devtools::document(package_path).
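A sketch of the two workable sequences described above (the package path is a placeholder):

```r
# Option 1: load the package's objects (including data/ contents) first,
# then run roxygen on the now-loaded package:
devtools::load_all("path/to/mypackage")
roxygen2::roxygenise("path/to/mypackage")

# Option 2: let document() handle the load step itself:
devtools::document("path/to/mypackage")
```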
Please try adding these additional tags in your comments: @name, @docType, and @references

Does roxygen2 work for R scripts in data-raw?

I am using RStudio to create a package for a piece of data analysis I'm doing. To put my raw data into the package, I'm using devtools::use_data_raw() as per this article.
I have a script load-raw-data.R that loads the raw data and assembles it into a dataframe, then calls devtools::use_data() on this dataframe to add it to the package. load-raw-data.R is in /data-raw not /R, as per the article. I've added documentation to the functions in this script via a roxygen2 skeleton, however when I build the documentation the .Rd files for these functions are not built. I presume this is because roxygen2 is only looking in /R. Is there a way to tell roxygen2 to look in /data-raw as well? Or have I misunderstood something along the way?
Update: following @phil's suggestion
@phil - thanks - I tried this for one of the functions (load_data_files) in the load-raw-data.R script (see below for the documentation added to R/data.R), but on rebuilding the package I get an error: 'load_data_files' is not an exported object from 'namespace:clahrcnwlhf'. I have included the @export tag in the documentation in R/data.R. Any thoughts on how I might resolve this?
# This script loads the individual component files of the raw dataset
# and stitches them together, saving the result as an .RData file

#' load_data_files
#'
#' load_data_files loads in a set of Excel files as dataframes
#'
#' @param fl list of paths of the files to be loaded
#'
#' @return A list of dataframes, one for each of the file paths in fl.
#' @export
"load_data_files"

Referencing user-created functions in R from separate scripts

I'm trying to reuse some code that I've already written but often need to re-execute for various projects (i.e., I'd like to apply some object-oriented principles to my R code). I know that a framework exists for publishing new packages on CRAN, but the code I have isn't something that would be valuable for other parties.
Essentially I'd like to either create my own local packages and reference them using a require() call or at the very least call functions that I've saved in separate .r files as-needed.
I've searched around online and found several lengthy articles about creating packages and compiling them using RTools (I'm on a Windows OS) but since I'm not writing C this seems overkill for my simple purposes. To offer an example of what I'm referring to, I have a script to remove unwanted characters from string data that I constantly need to copy/paste into new scripts; I don't want to do this and would prefer to just do something like require(myFunction).
Is there a simple way to solve this problem or am I best served by grabbing RTools and compiling my custom functions locally?
Creating an R package is actually super easy. The link from Alex is how I started my first package. Here's a slightly simplified version I have to give my students. (NB: full credit to Hilary Parker, the author of the original blogpost).
First install devtools and roxygen:
install.packages("devtools")
library("devtools")
install.packages("roxygen2")
library("roxygen2")
Make a new directory for your functions:
setwd("/path/to/parentdirectory")
create("mypackage")
Add your functions to a file (or files) named anything.R in the R directory. The file should look like this, you can have one function per file, or multiple:
mymeanfun <- function(x){
  mean(x)
}

myfilterfun <- function(x, y){
  filter(x, y)
}
Now you should document the code. You can document (and import) using roxygen. Make sure you @import functions from any other packages, and @export the functions you want available. Roxygen and devtools will take care of everything else (the NAMESPACE file, requirements, etc.) until you get more advanced. Everything else is optional:
#' My Mean Function
#'
#' Takes the mean
#' @param x any default data type
#' @export
#' @examples
#' mymeanfun(c(1,2,3))
mymeanfun <- function(x){
  mean(x)
}

#' My Filter Function
#'
#' Identical to dplyr::filter
#' @param x a data.frame
#' @param y the filter condition
#' @export
#' @importFrom dplyr filter
myfilterfun <- function(x, y){
  filter(x, y)
}
Now run document() in the directory you created:
setwd("./mypackage")
document()
You are now up and running - I'd recommend putting it on github and installing from there:
install_github("yourgithubname/mypackage")
From then on, every time you need your functions, you can just call:
library(mypackage)
For more details and better documentation practices, see Hadley's book.
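As an aside, if a full package still feels too heavy, the question's minimal alternative of keeping functions in a separate .R file and loading them on demand also works (file and function names here are hypothetical):

```r
# string_utils.R would contain the reusable definitions, e.g.:
# clean_strings <- function(x) gsub("[^[:alnum:] ]", "", x)

# Any script can then pull those functions into its environment:
source("string_utils.R")
cleaned <- clean_strings("some*messy!input")
```

The trade-off is that source() gives you no namespace, versioning, or documentation, which is what the package route buys you.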

NA appears where title should be for R package manual pdf

I have an NA that I cannot figure out how to fix in one of the entries in the manual for my package. It is the first topic, the package itself. It should display the title of my package, "MA Birk's Functions", but instead it just shows NA.
Here is the relevant .R code for the package description (note I'm using roxygen2):
#' MA Birk’s Functions
#'
#' This is a compilation of functions that I found useful to make. It currently includes a unit of measurement conversion function, a Q10 calculator for temperature dependence of chemical and biological rates, and some miscellaneous wrapper functions to make R code shorter and faster to write.
#'
#' @author Matthew A. Birk, \email{matthewabirk@@gmail.com}
#' @docType package
#' @name birk
#' @encoding UTF-8
NULL
And the resulting .Rd file:
% Generated by roxygen2 (4.1.0): do not edit by hand
% Please edit documentation in R/birk.R
\docType{package}
\encoding{UTF-8}
\name{birk}
\alias{birk}
\alias{birk-package}
\title{MA Birk’s Functions}
\description{
This is a compilation of functions that I found useful to make. It currently includes a unit of measurement conversion function, a Q10 calculator for temperature dependence of chemical and biological rates, and some miscellaneous wrapper functions to make R code shorter and faster to write.
}
\author{
Matthew A. Birk, \email{matthewabirk@gmail.com}
}
I notice no NA appearing in any of the resulting HTML pages, nor any errors in building or checking the package. But the NA appears here in the manual.
I have 3 earlier versions of this package that never had this problem. For this latest version I switched from manually entering .Rd info to using roxygen2. I'm thinking the NA issue is arising from that somewhere...
I don't think the DESCRIPTION file has anything to do with it, so I did not include it in this question (too cluttered), but let me know if you suspect it and I can add that as well.

Is it possible to use R package data in testthat tests or run_examples()?

I'm working on developing an R package, using devtools, testthat, and roxygen2. I have a couple of data sets in the data folder (foo.txt and bar.csv).
My file structure looks like this:
/ mypackage
/ data
* foo.txt, bar.csv
/ inst
/ tests
* run-all.R, test_1.R
/ man
/ R
I'm pretty sure 'foo' and 'bar' are documented correctly:
#' Foo data
#'
#' Sample foo data
#'
#' @name foo
#' @docType data
NULL

#' Bar data
#'
#' Sample bar data
#'
#' @name bar
#' @docType data
NULL
I would like to use the data in 'foo' and 'bar' in my documentation examples and unit tests.
For example, I would like to use these data sets in my testthat tests by calling:
data(foo)
data(bar)
expect_that(foo$col[1], equals(bar$col[1]))
And, I would like the examples in the documentation to look like this:
#' @examples
#' data(foo)
#' functionThatUsesFoo(foo)
If I try to call data(foo) while developing the package, I get the error "data set 'foo' not found". However, if I build the package, install it, and load it - then I can make the tests and examples work.
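One development-time workaround (a sketch, not part of the original question): devtools::load_all() also loads the objects stored under data/, so the datasets become available in the session before the package is installed:

```r
# Run from the package root during development; load_all() sources R/
# and loads the .rda objects from data/ into the session:
devtools::load_all(".")

# foo and bar should now be visible for interactive testing:
exists("foo")
exists("bar")
```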
My current work-arounds are to not run the example:
#' @examples
#' \dontrun{data(foo)}
#' \dontrun{functionThatUsesFoo(foo)}
And in the tests, pre-load the data using a path specific to my local computer:
foo <- read.delim(pathToFoo, sep="\t", fill = TRUE, comment.char="#")
bar <- read.delim(pathToBar, sep=";", fill = TRUE, comment.char="#")
expect_that(foo$col[1], equals(bar$col[1]))
This does not seem ideal - especially since I'm collaborating with others - requiring all the collaborators to have the same full paths to 'foo' and 'bar'. Plus, the examples in the documentation look like they can't be run, even though once the package is installed, they can.
Any suggestions? Thanks much.
Importing non-RData files within examples/tests
I found a solution to this problem by peering at the JSONIO package, which obviously needed to provide some examples of reading files other than those of the .RData variety.
I got this to work in function-level examples, and satisfy both R CMD check mypackage as well as testthat::test_package().
(1) Re-organize your package structure so that example data directory is within inst. At some point R CMD check mypackage told me to move non-RData data files to inst/extdata, so in this new structure, that is also renamed.
/ mypackage
/ inst
/ tests
* run-all.R, test_1.R
/ extdata
* foo.txt, bar.csv
/ man
/ R
/ tests
* run-testthat-mypackage.R
(2) (Optional) Add a top-level tests directory so that your new testthat tests are now also run during R CMD check mypackage.
The run-testthat-mypackage.R script should have at minimum the following two lines:
library("testthat")
test_package("mypackage")
Note that this is the part that allows testthat to be called during R CMD check mypackage, and it is not necessary otherwise. You should add testthat as a "Suggests:" dependency in your DESCRIPTION file as well.
(3) Finally, the secret-sauce for specifying your within-package path:
barfile <- system.file("extdata", "bar.csv", package="mypackage")
bar <- read.csv(barfile)
# remainder of example/test code here...
If you look at the output of the system.file() command, it is returning the full system path to your package within the R framework. On Mac OS X this looks something like:
"/Library/Frameworks/R.framework/Versions/2.15/Resources/library/mypackage/extdata/bar.csv"
The reason this seems okay to me is that you don't hard code any path features other than those within your package, so this approach should be robust relative to other R installations on other systems.
data() approach
As for the data() semantics, as far as I can tell this is specific to R binary (.RData) files in the top-level data directory. So you can circumvent my example above by pre-importing the data files and saving them with the save() command into your data-directory. However, this assumes you only need to show an example in which the data is already loaded into R, as opposed to also reproducibly demonstrating the upstream process of importing the files.
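A sketch of that pre-import step, using the file names from the question (run once, e.g. from a data-preparation script at the package root):

```r
# Import the raw files once...
foo <- read.delim("inst/extdata/foo.txt", sep = "\t", fill = TRUE, comment.char = "#")
bar <- read.csv("inst/extdata/bar.csv")

# ...and write binary copies into data/, which is where data() looks.
# After rebuilding, data(foo) and data(bar) work in examples and tests.
save(foo, file = "data/foo.rda")
save(bar, file = "data/bar.rda")
```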
Per @hadley's comment, the .RData conversion will work well.
As for the broader question of team collaboration with different environments across team members, a common pattern is to agree on a single environment variable, e.g., FOO_PROJECT_ROOT, that everyone on the team will set up appropriately in their environment. From that point on you can use relative paths, including across projects.
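A sketch of that pattern in R, reusing the FOO_PROJECT_ROOT name from above (each collaborator would set the variable in their own environment, e.g. in .Renviron):

```r
# Resolve the agreed-upon root from the environment...
root <- Sys.getenv("FOO_PROJECT_ROOT")

# ...then build paths relative to it, so no machine-specific path is hard-coded:
foo_path <- file.path(root, "data", "foo.txt")
foo <- read.delim(foo_path, sep = "\t", fill = TRUE, comment.char = "#")
```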
An R-specific approach would be to agree on some data/functions that every team member will set up in their .Rprofile files. That's, for example, how devtools finds packages in non-standard locations.
Last but not least, though it is not optimal, you can actually put developer-specific code in your repository. If @hadley does it, it's not such a bad thing. See, for example, how he activates certain behaviors in testthat in his own environment.