I would like to include the mice::mice function in my package to perform imputation on my data.
I use Roxygen to list imports
#' @param data dataset to be used for imputation
#' @importFrom dplyr select_
#' @importFrom mice mice complete
#' @return A list
#' @export
#'
impute_data <- function(data, vars, seed)
{
data_used <- select_(data,vars)
mice_data <- complete(mice(data_used, seed = seed))
return(mice_data)
}
This function works fine when I test the code, however when I build the package and try to use it, I get the following error
Error in check.method(setup, data) :
The following functions were not found: mice.impute.pmm,mice.impute.pmm, mice.impute.pmm, mice.impute.pmm, mice.impute.pmm
I tried to add to the imports all the functions mentioned in the error but it had no effect whatsoever on the outcome.
What am I missing? I've never run into this problem before.
You are forgetting to handle the DESCRIPTION file! You only handle impute_data.R.
Your question is quite similar to:
What roxygen should I put when I use a function of another package in my function
I gave an answer there (please search for similar questions before posting). For your case:
First, be aware of your session details:
sessionInfo()
getwd() # your R's working directory
.libPaths() # your R's library location
Step0 Install (if needed) and load the necessary packages:
library(roxygen2)
library(devtools)
library(digest)
Step1 Put all your related ".R" files (yourfunction1.R, yourfunction2.R, yourfunction3.R, impute_data.R) to your R's working directory.
Step2 Create your package skeleton in your R's working directory:
Be sure that there is no folder named "yourpackage" in your R's working directory before running the following command. (from R's console)
package.skeleton(name = "yourpackage", code_files = c("yourfunction1.R", "yourfunction2.R", "yourfunction3.R", "impute_data.R"), path = ".")
After running package.skeleton, the folder yourpackage is created in your R's Working Directory.
Delete Read-and-delete-me file from Windows Explorer.
Delete "yourpackage-package.Rd" file in YourR'sWorkingDirectory\yourpackage\man folder
(Do NOT delete "yourpackage.Rd" file in YourR'sWorkingDirectory\yourpackage\man folder!)
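To see what Step2 produces before running it on real code, here is a minimal sketch that calls package.skeleton in a temporary directory. The package name demopkg and the one-line function add_one are made up for illustration; only package.skeleton itself comes from the steps above.

```r
# Throwaway working directory so nothing in the real working directory changes
work_dir <- file.path(tempdir(), "skeleton_demo")
dir.create(work_dir, showWarnings = FALSE)

# Hypothetical code file standing in for impute_data.R
code_file <- file.path(work_dir, "add_one.R")
writeLines("add_one <- function(x) x + 1", code_file)

# Create the skeleton; force = TRUE overwrites a leftover folder of the same name
suppressMessages(
  utils::package.skeleton(name = "demopkg", code_files = code_file,
                          path = work_dir, force = TRUE)
)

# List everything that was generated (DESCRIPTION, NAMESPACE, R/, man/, ...)
created <- list.files(file.path(work_dir, "demopkg"), recursive = TRUE)
print(created)
```

The listing shows which generated files the steps above ask you to delete or keep.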
Step3 At the end of the preamble of your ".R" file (impute_data.R), put the following (if you had not already done so in Step1):
#' @importFrom mice mice
#' @importFrom mice complete
#' @export
impute_data <- function(...) {...
Step4 In the DESCRIPTION file of your package, in the Imports part, add:
Imports:
mice (>= VersionNumber)
where VersionNumber is the version number of the mice package you are using. You can find it by running packageVersion("mice"), or by right-clicking any function in the Object Browser of Revolution R Enterprise and going to the bottom of the resulting .html help file, where the version number of the package is shown.
In Step2, package.skeleton automatically produced a NAMESPACE file whose content is:
exportPattern("^[[:alpha:]]+")
Do not edit this NAMESPACE file by hand.
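A quick way to confirm that the Imports field from Step4 is written correctly is to parse the DESCRIPTION file with base R's read.dcf. The sketch below writes a made-up DESCRIPTION to a temporary directory; the version numbers and the extra dplyr entry are assumptions for illustration only.

```r
# Hypothetical DESCRIPTION written to a temporary directory for illustration
desc_path <- file.path(tempdir(), "DESCRIPTION")
writeLines(c(
  "Package: yourpackage",
  "Version: 0.1.0",
  "Imports:",
  "    mice (>= 3.0.0),",
  "    dplyr"
), desc_path)

# read.dcf() parses the control-file format that DESCRIPTION uses
desc <- read.dcf(desc_path, fields = c("Package", "Imports"))

# Split the Imports field on commas and strip the version constraints
imports <- trimws(strsplit(desc[1, "Imports"], ",")[[1]])
import_names <- sub("\\s*\\(.*\\)$", "", imports)

has_mice <- "mice" %in% import_names
print(import_names)
```

Pointing desc_path at the real yourpackage/DESCRIPTION gives the same check for the actual package.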
Step5 Roxygenize the package you want to create ("yourpackage"):
library(roxygen2)
roxygenize("yourpackage")
Upon roxygenization, the content of the NAMESPACE file of yourpackage is automatically converted from exportPattern("^[[:alpha:]]+") to
# Generated by roxygen2: do not edit by hand
export(impute_data)
importFrom(mice,mice)
importFrom(mice,complete)
Step6 Build your package:
(first, delete "src-i386" and "src-x64" folders (if any) in YourR'sWorkingDirectoryFolder\yourpackage folder from Windows Explorer)
(Be sure again that there is no "yourpackage-package.Rd" file in YourR'sWorkingDirectory\yourpackage\man folder. If there is, delete it before building)
build("yourpackage")
Step7 Install your package:
install("yourpackage")
Step8 Check that all is going well by loading your package and running a function in the package.
library(yourpackage)
impute_data(a,b,1235) # "impute_data" is the function in the package "yourpackage"
Step9 Check that your package passes the CRAN (Comprehensive R Archive Network) checks (if you want to share your package):
(first, delete "src-i386" and "src-x64" folders (if any) in YourR'sWorkingDirectoryFolder\yourpackage folder from Windows Explorer)
(Be sure again that there is no "yourpackage-package.Rd" file in YourR'sWorkingDirectory\yourpackage\man folder. If there is, delete it before checking)
From DOS Command Prompt:
Start - cmd - Enter. Change to R's working directory (your R's working directory is known via getwd()) and run the CRAN check:
cd C:\Users\User\Documents\Revolution
R CMD check yourpackage
From R's console:
devtools::check("C:/Users/User/Documents/Revolution/yourpackage")
Hello, even though this post is old:
I recently came across the same problem, and the solutions proposed by
Erdogan CEVHER and mickkk did not work for me. I solved it by attaching the mice package whenever my own package is loaded. For more detailed information, consult R-Package-Dependencies.
In addition to the steps required during package development, here is what I recommend:
Part 1: Add mice to the Depends: (not Imports:) field in the DESCRIPTION file of your package.
Depends: mice (>= VERSIONNUMBER)
Part 2: Use import(mice) in NAMESPACE (only for devtools::check())
import(mice)
Part 3: Reference each function using mice::, for example
mice::mice(data, method="pmm")
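One plausible reason why Depends helps where Imports does not: Imports only loads the mice namespace, whereas Depends also attaches it to the search path, and a function that is looked up by name at run time (such as mice.impute.pmm in the error above) may only be found on the search path. The sketch below illustrates the distinction with the base package tools instead of mice, so it runs without extra installs. That mice performs its lookup this way is an assumption, not something confirmed in this thread.

```r
# tools ships with R but is normally not attached in a fresh session,
# which makes it a convenient stand-in for a package that is only imported.
fn_name <- "file_ext"

# Lookup by name along the search path (what an attached package provides)
on_search_path <- exists(fn_name, mode = "function")

# Lookup by name inside the package namespace (what loading provides)
in_namespace <- exists(fn_name, envir = asNamespace("tools"), mode = "function")

cat("found on search path:", on_search_path, "\n")
cat("found in namespace:  ", in_namespace, "\n")

# Qualifying the call with :: works whether or not the package is attached
ext <- tools::file_ext("impute_data.R")
print(ext)
```

In a fresh session the first lookup typically fails while the second succeeds, which mirrors a package that is imported but not attached.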
Related
Imagine you define a single R function to share with a pal. In case you later decide to include this function in a package, you document it using roxygen comments and tags (e.g. #' @name my_function). Is it possible to produce a PDF from this single R file? If yes, how?
1) We will use the file lc.R as an example which we first download from github. First use kitten to create the boilerplate for a package. Copy lc.R to it. Then run document from devtools to roxygenize it and finally use Rd2pdf to create the pdf, lc.pdf .
library(devtools)
library(pkgKitten)
library(roxygen2)
# set up lc in lc.R to use as a test example
u <- "https://raw.githubusercontent.com/mailund/lc/master/R/lc.R"
download.file(u, "./lc.R")
# create package containing lc.R - ignore any NAMESPACE warnings
kitten("lc")
file.copy("lc.R", "./lc/R")
# roxygenize it generating an Rd file
document("lc")
file.copy("lc/man/lc.Rd", ".")
# convert Rd file to pdf
R <- file.path(R.home("bin"), "R")
cmd <- paste(R, "CMD Rd2pdf lc.Rd")
system(cmd, wait = FALSE)
2) There used to be a package on CRAN named document (or see gitlab) which does the same thing in one step but it was removed last year. Note that the document package depends on the fritools (or see gitlab) package which was also removed. The source of both are archived on CRAN and on gitlab and it may be possible to build them yourself.
3) This approach does not create a PDF but it does allow one to view formatted help for a script converting it from the roxygen2 markup to HTML showing it in the browser. Note that the box package should not be attached, i.e. do not use a library(box) statement. Assume that lc.R is in the current directory -- see the download.file statement in (1) above. The code below may generate warnings or errors but it still works to bring up the help for the lc function in lc.R showing it in the default browser.
box::use(./lc)
box::help(lc$lc)
I’ve written some R functions and dropped them into a script file using RStudio. These are bits of code that I use over and over, so I’m wondering how I might most easily create an R package out of them (for my own private use).
I’ve read various “how to” guides online but they’re quite complicated. Can anyone suggest an “idiot’s guide” to doing this please?
I've been involved in creating R packages recently, so I can help you with that. Before proceeding to the steps to be followed, there are some pre-requisites, which include:
RStudio
devtools package (for most of the functions involved in creation of a package)
roxygen2 package (for roxygen documentation)
In case you don't have the aforementioned packages, you can install them with these commands respectively:
install.packages("devtools")
install.packages("roxygen2")
Steps:
(1) Import devtools in RStudio by using library(devtools).
(devtools is a core package that makes creating R packages easier with its tools)
(2) Create your package by using:
create_package("~/directory/package_name") for a custom directory.
or
create_package("package_name") if you want your package to be created in current workspace directory.
(3) Soon after you execute this function, it will open a new RStudio session. You will observe that in the old session some lines will be auto-generated which basically tells R to create a new package with required components in the specified directory.
After this, we are done with this old instance of RStudio. We will continue our work on the new RStudio session window.
At this point the package creation part is already over (yes, that simple). However, a package isn't functional just by being created: to include a function in it you also need documentation (the function's title, parameters, return value, examples, etc., written using @param, @return and so on; this will look familiar if you have seen roxygen documentation in GitHub repositories) and R CMD checks to get it working.
I'll get to that in the subsequent steps, but just in case you want to verify that your package is created, you can look at:
The top right corner of the new RStudio session, where you can see the package name that you created.
The console, where you will see that R created a new directory/folder in the path that we specified in create_package() function.
The files panel of RStudio session, where you'll notice a bunch of new files and directories within your directory.
(4) You mentioned that you keep your functions in a script file, so you will first need to create a script inside the package, which can be done using:
use_r("function_name")
A new R script will pop up in your working session, ready to be used.
Now go ahead and write your function(s) in it.
(5) After you're done, you need to load the function(s) you have written for your package. This is accomplished by using the devtools::load_all() function.
When you execute load_all() in the console, you'll know that the functions have been loaded when you see Loading package_name displayed in the console.
You can try calling your functions after that in the console to verify that they work as a part of the package.
(6) Now that your function has been written and loaded into your package, it is time to move onto checks. It is a good practice to check the whole package as we make changes to our package. The function devtools::check() offers an easy way to do this.
Try executing check() in the console, it will go through a number of procedures checking your package for warnings/errors and give details for the same as messages on the screen (pertaining to what are the errors/warnings/notes). The R CMD check results at the end will contain the vital logs for you to see what are the errors and warnings you got along with their frequency.
If the functions in your package are written well (with additional package dependencies taken care of), check() will give you two warnings:
The first warning concerns the license, which is not specified for a new package.
The second concerns documentation, warning us that our code is not documented.
To resolve the first issue, use the use_mit_license("license_holder_name") command with your name in place of license_holder_name (or choose any other license that suits your package; for private use, as you mentioned, it doesn't really matter what you specify if only you are going to use it and it is not going to be distributed).
This will fill in the License field in the DESCRIPTION file (in your files panel) and create additional files containing the license information.
You'll also need to edit the DESCRIPTION file, which has self-explanatory fields to fill in or edit. Here is an example:
Package: Your_package_name
Title: Give a brief title
Version: 1.0.0.0
Authors@R:
person(given = "Your_first_name",
family = "Your_surname/family_name",
role = c("aut", "cre"),
email = "youremailaddress@gmail.com",
comment = c(ORCID = "YOUR-ORCID-ID"))
Description: Give a brief description considering your package functionality.
License: will be updated with whatever license you provide, the above step will take care of this line.
Encoding: UTF-8
LazyData: true
To resolve the documentation warning, you'll need to document your function using roxygen documentation. An example:
#' @param a parameter one
#' @param b parameter two
#' @return sum of a and b
#' @export
#'
#' @examples
#' yourfunction(1,2)
yourfunction <- function(a,b)
{
sum <- a+b
return(sum)
}
Follow the roxygen syntax and add tags as needed. Some are optional, such as @title for specifying a title, while others, such as @import or @importFrom, are required if you're using functions from packages other than base R.
After you're done documenting your function(s) with roxygen comments, tell the package about it by running devtools::document(). Then run check() again to see whether you still get any warnings. If you don't, you're good to go (and you won't if you followed the steps).
Lastly, you'll need to install the package so that R can access it. Simply use the install() command from devtools (not the install.packages() you used at the beginning; you don't need to specify the package, as in install("package"), since you are working in a session where the package is the current project). After a few lines of installation output you'll see a statement like "DONE (package_name)", which indicates that the installation is complete.
Now you can try your function by first loading your package with library("package_name") and then calling the function you want. That's it, congrats, you did it!
I've tried to include the procedure in a lucid way (the way I create my R packages), but if you have any doubts feel free to ask.
I am new to R and I am trying to make a standalone executable so that my scripts can be run without development tools. I have created multiple R scripts containing different functions and have been using a main.r script to connect the other scripts. I have been using RStudio and using Source on each file to add them to the Global Environment and finally using Source on my main file to start executing my program. When attempting to build a binary package through:
Build > Build Binary Package
I was getting the error:
ERROR: The build directory does not contain a DESCRIPTION
file so cannot be built as a package.
So I created a package and now the error I get is
** preparing package for lazy loading
Error in reorderPopulation(pop_fitness_list) :
could not find function "reorderPopulation"
Error : unable to load R code in package 'EAtsp'
ERROR: lazy loading failed for package 'EAtsp'
* removing 'C:/Users/Ryan/AppData/Local/Temp/RtmpsXbv0j/temp_libpath27ec59515c59/EAtsp'
Error: Command failed (1)
Execution halted
Exited with status 1.
Can someone explain to me how to fix this problem?
EDIT: I have since added roxygen comments to each of my functions and they are all displaying within the NAMESPACE file but still have the same issue.
These are the files my R directory contains:
fitness.r
initDataset.r
main.r
operators.r
selection.r
The functions within fitness.r can be found from main.r with no problem so I moved the reorderPopulation function which was previously in selection.r to fitness.r and it can be found. Why can the functions inside the selection.r file and possibly the others not be found?
There's nothing reproducible, so I'll go through a hacked example that works, perhaps you can use it as a template for explaining what is different and why yours should still work.
./DESCRIPTION
Package: Porteous96
Title: This package does nothing
Version: 0.0.0.9000
Authors@R: person('r2evans', email='r2evans@ignore.stackoverflow.com', role=c('aut','cre'))
Description: This package still does nothing
Depends: R (>= 3.3.3)
License: MIT
Encoding: UTF-8
LazyData: true
RoxygenNote: 6.0.1
(Go ahead and try to send an email there ... I don't think it'll bug me ...)
./NAMESPACE
After create:
# Generated by roxygen2: fake comment so roxygen2 overwrites silently.
exportPattern("^[^\\.]")
After document:
# Generated by roxygen2: do not edit by hand
export(reorderPopulation)
(Regardless, this file needs no manual editing, assuming you are either using roxygen2 with its #' @export tag, or you are using the default "export almost everything" without roxygen2.)
./R/reorderPopulation.R
#' Do or do not
#'
#' (There is no try.)
#' @param ... any arguments ultimately ignored
#' @return nothing, invisibly
#' @export
reorderPopulation <- function(...) {
cat("do nothing\n")
invisible(NULL)
}
unorderPopulation <- function(...) {
reorderPopulation()
cat("should not be found\n")
invisible(NULL)
}
./R/zzz.R
I added this file just to try to "find" one of the exported functions from within this package.
.onLoad <- function(libname, pkgname) {
reorderPopulation("ignored", "stuff")
}
I can get away with assuming the function is available, per ?.onLoad:
Note that the code in '.onLoad' and '.onUnload' should not assume
any package except the base package is on the search path.
Objects in the current package will be visible (unless this is
circumvented), but objects from other packages should be imported
or the double colon operator should be used.
Build and Execute
I actually started this endeavor with a template directory created by starting in the intended directory and running:
devtools::create(".")
# Creating package 'Porteous96' in 'C:/Users/r2/Projects/StackOverflow'
# No DESCRIPTION found. Creating with values:
# Package: Porteous96
# Title: What the Package Does (one line, title case)
# Version: 0.0.0.9000
# Authors@R: "My Real Name <myreal@@email.address.com> [aut,cre]"
# Description: What the package does (one paragraph).
# Depends: R (>= 3.3.3)
# License: Call for information, please
# Encoding: UTF-8
# LazyData: true
# * Creating `Porteous96.Rproj` from template.
# * Adding `.Rproj.user`, `.Rhistory`, `.RData` to ./.gitignore
However, you can easily just use the samples I provided above and move forward without calling create. (It also includes some other files, e.g., ./.gitignore, ./Porteous96.Rproj, and ./.Rbuildignore, none of which are required in the rest of my process here. If you have them and they have non-default values, that might be good to know.)
From there, I edited/created the above files, then:
devtools::document(".")
# Updating Porteous96 documentation
# Loading Porteous96
# do nothing
# First time using roxygen2. Upgrading automatically...
# Writing NAMESPACE
# Writing reorderPopulation.Rd
(The reason you see "do nothing" above and below is that I put it in a function named .onLoad, which is triggered each time the package is loaded. This includes devtools::document and devtools::install as well as the obvious library(Porteous96).)
One side-effect of that is that a ./man/ directory is created with the applicable help files. In this case, a single file, reorderPopulation.Rd, no need to show it here.
devtools::install(".")
# Installing Porteous96
# "c:/R/R-3.3.3/bin/x64/R" --no-site-file --no-environ --no-save --no-restore \
# --quiet CMD INSTALL "C:/Users/r2/Projects/StackOverflow/Porteous96" \
# --library="C:/Users/r2/R/win-library/3.3" --install-tests
# * installing *source* package 'Porteous96' ...
# ** R
# ** preparing package for lazy loading
# ** help
# *** installing help indices
# ** building package indices
# ** testing if installed package can be loaded
# *** arch - i386
# do nothing
# *** arch - x64
# do nothing
# * DONE (Porteous96)
# Reloading installed Porteous96
# do nothing
For good measure, I close R and re-open it. (Generally unnecessary.)
library(Porteous96)
# do nothing
(Again, this is dumped to the console because of .onLoad.)
reorderPopulation()
# do nothing
unorderPopulation()
# Error: could not find function "unorderPopulation"
Porteous96:::unorderPopulation()
# do nothing
# should not be found
Wrap-Up
I'm guessing this does not solve your problem. It highlights about as much as I could glean from your question(s). Perhaps it provides enough framework where you can mention salient differences between my files and yours. Though answers are not meant for pre-solution discussion, I think it is sometimes necessary and useful.
After help from @r2evans I have managed to find a solution to the problem.
My main.r file was just a bunch of function calls with no function wrapping them. So I wrapped the function calls in a function and the function now looks as follows:
mainFunction <- function() {
source("R/initSetup.r")
initSetup()
...
}
initSetup.r contains more source() calls to the other files that I use. The program is then run using the command mainFunction() in the R console
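A likely explanation for the original error, offered here as an assumption and not something confirmed in the thread: the files in a package's R directory are evaluated in alphabetical order by default, and top-level calls run at build time. fitness.r sorts before main.r, so its functions already exist when main.r runs, while selection.r sorts after it. The sketch below simulates this with plain files in a temporary directory; the file names follow the question, the function bodies are made up.

```r
# Build a throwaway directory that mimics the R/ folder from the question
r_dir <- file.path(tempdir(), "collate_demo")
dir.create(r_dir, showWarnings = FALSE)
writeLines("reorderPopulation <- function(x) sort(x)", file.path(r_dir, "selection.r"))
writeLines("result <- reorderPopulation(c(3, 1, 2))", file.path(r_dir, "main.r"))

# Evaluate every file in alphabetical order into one environment,
# roughly what happens to a package's R/ directory at build time
load_alphabetically <- function(dir) {
  env <- new.env(parent = baseenv())
  for (f in sort(list.files(dir, full.names = TRUE))) {
    sys.source(f, envir = env)
  }
  env
}

# main.r runs before selection.r, so the top-level call fails
top_level_outcome <- tryCatch(
  { load_alphabetically(r_dir); "loaded" },
  error = function(e) conditionMessage(e)
)
print(top_level_outcome)

# Wrapping the call in a function defers the lookup until call time
writeLines("mainFunction <- function() reorderPopulation(c(3, 1, 2))",
           file.path(r_dir, "main.r"))
pkg_env <- load_alphabetically(r_dir)
print(pkg_env$mainFunction())
```

If this is indeed the cause, the source() calls inside mainFunction should not be needed in an installed package, since every file in R/ is loaded with the package anyway.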
I'm working on developing an R package, using devtools, testthat, and roxygen2. I have a couple of data sets in the data folder (foo.txt and bar.csv).
My file structure looks like this:
/ mypackage
/ data
* foo.txt, bar.csv
/ inst
/ tests
* run-all.R, test_1.R
/ man
/ R
I'm pretty sure 'foo' and 'bar' are documented correctly:
#' Foo data
#'
#' Sample foo data
#'
#' @name foo
#' @docType data
NULL
#' Bar data
#'
#' Sample bar data
#'
#' @name bar
#' @docType data
NULL
I would like to use the data in 'foo' and 'bar' in my documentation examples and unit tests.
For example, I would like to use these data sets in my testthat tests by calling:
data(foo)
data(bar)
expect_that(foo$col[1], equals(bar$col[1]))
And, I would like the examples in the documentation to look like this:
#' @examples
#' data(foo)
#' functionThatUsesFoo(foo)
If I try to call data(foo) while developing the package, I get the error "data set 'foo' not found". However, if I build the package, install it, and load it - then I can make the tests and examples work.
My current work-arounds are to not run the example:
#' @examples
#' \dontrun{data(foo)}
#' \dontrun{functionThatUsesFoo(foo)}
And in the tests, pre-load the data using a path specific to my local computer:
foo <- read.delim(pathToFoo, sep="\t", fill = TRUE, comment.char="#")
bar <- read.delim(pathToBar, sep=";", fill = TRUE, comment.char="#")
expect_that(foo$col[1], equals(bar$col[1]))
This does not seem ideal - especially since I'm collaborating with others - requiring all the collaborators to have the same full paths to 'foo' and 'bar'. Plus, the examples in the documentation look like they can't be run, even though once the package is installed, they can.
Any suggestions? Thanks much.
Importing non-RData files within examples/tests
I found a solution to this problem by peering at the JSONIO package, which obviously needed to provide some examples of reading files other than those of the .RData variety.
I got this to work in function-level examples, and satisfy both R CMD check mypackage as well as testthat::test_package().
(1) Re-organize your package structure so that example data directory is within inst. At some point R CMD check mypackage told me to move non-RData data files to inst/extdata, so in this new structure, that is also renamed.
/ mypackage
/ inst
/ tests
* run-all.R, test_1.R
/ extdata
* foo.txt, bar.csv
/ man
/ R
/ tests
* run-testthat-mypackage.R
(2) (Optional) Add a top-level tests directory so that your new testthat tests are now also run during R CMD check mypackage.
The run-testthat-mypackage.R script should have at minimum the following two lines:
library("testthat")
test_package("mypackage")
Note that this is the part that allows testthat to be called during R CMD check mypackage, and not necessary otherwise. You should add testthat as a "Suggests:" dependency in your DESCRIPTION file as well.
(3) Finally, the secret-sauce for specifying your within-package path:
barfile <- system.file("extdata", "bar.csv", package="mypackage")
bar <- read.csv(barfile)
# remainder of example/test code here...
If you look at the output of the system.file() command, it is returning the full system path to your package within the R framework. On Mac OS X this looks something like:
"/Library/Frameworks/R.framework/Versions/2.15/Resources/library/mypackage/extdata/bar.csv"
The reason this seems okay to me is that you don't hard code any path features other than those within your package, so this approach should be robust relative to other R installations on other systems.
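One detail worth guarding against: system.file() returns an empty string when the file is not found, which can surface later as a confusing read error. A minimal sketch of a wrapper that fails early is shown below. The helper name package_file is made up for illustration, and the demonstration uses the stats package (which ships with R) because mypackage is not installed here.

```r
# Resolve a file inside an installed package, failing loudly if it is absent.
# system.file() returns "" when nothing matches, which is easy to overlook.
package_file <- function(..., package) {
  path <- system.file(..., package = package)
  if (!nzchar(path)) {
    stop("file not found in package '", package, "'")
  }
  path
}

# stats ships with R, so this resolves on any installation
desc_path <- package_file("DESCRIPTION", package = "stats")
print(basename(desc_path))

# A missing file now raises an error instead of silently returning ""
missing_outcome <- tryCatch(
  package_file("no-such-file.csv", package = "stats"),
  error = function(e) "error raised"
)
print(missing_outcome)

# In a test for the hypothetical mypackage this would read:
# bar <- read.csv(package_file("extdata", "bar.csv", package = "mypackage"))
```

system.file() also accepts mustWork = TRUE, which gives similar behaviour without a wrapper.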
data() approach
As for the data() semantics, as far as I can tell this is specific to R binary (.RData) files in the top-level data directory. So you can circumvent my example above by pre-importing the data files and saving them with the save() command into your data-directory. However, this assumes you only need to show an example in which the data is already loaded into R, as opposed to also reproducibly demonstrating the upstream process of importing the files.
Per @hadley's comment, the .RData conversion will work well.
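The conversion can be sketched as follows: read the text file once, save the resulting object into the package's data directory, and from then on the object can be loaded by name. The directory below is a temporary stand-in for mypackage, and the two-column table is made-up sample content, not the real foo.txt.

```r
# Temporary stand-in for the package root (hypothetical layout)
pkg_root <- file.path(tempdir(), "mypackage")
dir.create(file.path(pkg_root, "data"), recursive = TRUE, showWarnings = FALSE)

# Made-up sample content standing in for foo.txt
raw_file <- file.path(pkg_root, "foo.txt")
writeLines(c("id\tcol", "1\t10", "2\t20"), raw_file)

# One-time conversion: read the raw file and save the object under its own name
foo <- read.delim(raw_file, sep = "\t")
save(foo, file = file.path(pkg_root, "data", "foo.RData"))

# Simulate a fresh session: load the saved object into an empty environment
fresh <- new.env()
loaded_names <- load(file.path(pkg_root, "data", "foo.RData"), envir = fresh)
print(loaded_names)
print(fresh$foo)
```

The name of the object inside the file (here foo) is what data(foo) makes available, so it should match the name used in the documentation.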
As for the broader question of team collaboration with different environments across team members, a common pattern is to agree on a single environment variable, e.g., FOO_PROJECT_ROOT, that everyone on the team will set up appropriately in their environment. From that point on you can use relative paths, including across projects.
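A minimal sketch of the environment-variable pattern, assuming the variable name FOO_PROJECT_ROOT from the paragraph above. The helper project_path is hypothetical, and the variable is set inside the script only so the example runs on its own; each collaborator would normally define it in their shell or .Renviron.

```r
# Set here only for demonstration; normally defined outside of R
Sys.setenv(FOO_PROJECT_ROOT = tempdir())

# Build a path below the agreed project root, failing early if it is unset
project_path <- function(...) {
  root <- Sys.getenv("FOO_PROJECT_ROOT", unset = NA)
  if (is.na(root) || !nzchar(root)) {
    stop("FOO_PROJECT_ROOT is not set; define it in your environment first")
  }
  file.path(root, ...)
}

path_to_foo <- project_path("mypackage", "data", "foo.txt")
print(path_to_foo)
```

Tests can then call project_path() instead of hard-coding a path that exists on only one machine.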
An R-specific approach would be to agree on some data/functions that every team member will set up in their .Rprofile files. That's, for example, how devtools finds packages in non-standard locations.
Last but not least, though it is not optimal, you can actually put developer-specific code in your repository. If @hadley does it, it's not such a bad thing. See, for example, how he activates certain behaviors in testthat in his own environment.