R package data not available when importing in another package - r

I have one package "testing" with a data object "test_data" saved in data folder under file name "test_data.RData".
testing contains one function hello() that uses this data object
#' hello
#'
#' #return Prints hello "your_name"
#' #export
#'
#' #examples
#' hello()
hello <- function(your_name = "") {
print(paste("test_data has", nrow(test_data), "rows"))
print(sprintf("Hello %s!", your_name))
}
the following code works fine:
require(testing)
testing::hello()
[1] "test_data has 32 rows"
[1] "Hello !"
but this fails:
testing::hello()
Error in nrow(test_data) : object 'test_data' not found
Actually I do not use it directly but in another package testingtop that imports this function:
#' Title
#'
#' #export
#' #importFrom testing hello
hello2 <- function(){
hello()
}
I have testing in the Imports section of DESCRIPTION and this fails.
require(testingtop)
testingtop::hello2()
Error in nrow(test_data) : object 'test_data' not found
If I put it in Depends it works if I load the package with library()
otherwise it still fails:
> library(testingtop)
Loading required package: testing
> testingtop::hello2()
[1] "test_data has 32 rows"
[1] "Hello !"
Restarting R session...
> testingtop::hello2()
Error in nrow(test_data) : object 'test_data' not found
if it was a function instead of a data object Imports would be fine, why is it different with a data object and I need to load the imported package? Did I miss something? And is it related to LazyData and LazyLoad ?
Probably a duplicate of this question

SO I think I've found the solution from the doc of the data function ?data
Use of data within a function without an envir argument has the almost always undesirable side-effect of putting an object in the user's workspace (and indeed, of replacing any object of that name already there). It would almost always be better to put the object in the current evaluation environment by data(..., envir = environment()). However, two alternatives are usually preferable, both described in the ‘Writing R Extensions’ manual.
For sets of data, set up a package to use lazy-loading of data.
For objects which are system data, for example lookup tables used in calculations within the function, use a file ‘R/sysdata.rda’ in the package sources or create the objects by R code at package installation time.
A sometimes important distinction is that the second approach places objects in the namespace but the first does not. So if it is important that the function sees mytable as an object from the package, it is system data and the second approach should be used.
Putting the data in the internal data file made my function hello2() see it
> testingtop::hello2()
[1] "test_data has 32 rows"
[1] "Hello !"

To supplement Benoit's answer. I had essentially this problem, but when using my package data as a default for a function parameter. In that case there's a third solution in the ?data help file: "In the unusual case that a package uses a lazy-loaded dataset as a default argument to a function, that needs to be specified by ::, e.g., survival::survexp.us."
This third approach solved it for me. (But I found it thanks to Benoit's link.)

Related

pkgdown fails parsing Rd files when examples are added

for some reason pkgdown is failing to parse one of the .Rd files that I have in my package. I found it fails when I add examples to the roxygen2 documentation using either the #examples tag or the #example inst/example/add.R alternative. I minimized my function to two arguments in order to make it more "reproducible" and still getting the same error. Please find bellow the error message, the .Rd file generated using that devtools::document() and the roxygen2 documentation of the function. As you can see I am using a very simple example that should run with no problems... One more thing to say is that when I run devtools::check() all my examples pass, so I don't understand why pkgdown is failing.
Thank you so much for your help.
Best,
Error message
Reading 'man/merge.Rd'
Error : Failed to parse Rd in merge.Rd
i unused argument (output_handler = evaluate::new_output_handler(value = pkgdown_print))
Error: callr subprocess failed: Failed to parse Rd in merge.Rd
i unused argument (output_handler = evaluate::new_output_handler(value = pkgdown_print))
.Rd file
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/merge.R
\name{merge}
\alias{merge}
\title{Merge two tables}
\usage{
merge(x, y)
}
\arguments{
\item{x}{data frame: referred to \emph{left} in R terminology, or \emph{master} in
Stata terminology.}
\item{y}{data frame: referred to \emph{right} in R terminology, or \emph{using} in
Stata terminology.}
}
\value{
a data.table joining x and y.
}
\description{
This is the main and, basically, the only function in joyn.
}
\examples{
x <- c(1, 2)
}
roxygen2 documentation
#' Merge two tables
#'
#' This is the main and, basically, the only function in joyn.
#'
#' #param x data frame: referred to *left* in R terminology, or *master* in
#' Stata terminology.
#' #param y data frame: referred to *right* in R terminology, or *using* in
#' Stata terminology.
#' #return a data.table joining x and y.
#' #export
#' #import data.table
#'
#' #examples
#' x <- c(1, 2)
This error comes from downlit::evaluate_and_highlight (sad it's not reported in the output), and can be fixed by installing the development version of downlit:
library(devtools)
install_github('r-lib/downlit')
It make sense only if you use the git version also from pkgdown, the stable pkgdown (version 1.6.1) runs just fine with the stable downlit. Of course development version of any package can break at any time, but until it doesn't, it's just alright.

Function imported from dependency not found, requires library(dependency)

I am trying to create an R package that uses functions from another package (gamlss.tr).
The function I need from the dependency is gamlss.dist::TF (gamlss.dist is loaded alongside gamlss.tr), but it is referenced in my code as simply TF within a call to gamlss.tr::gen.trun.
When I load gamlss.tr manually with library(), this works. However, when I rely on the functions of the dependency automatically being imported by my package through #import, I get an "object not found" error as soon as TF is accessed.
My attempt to be more explicit and reference the function I need as gamlss.dist::TF resulted in a different error ("unexpected '::'").
Any tips on how to use this function in my package would be much appreciated!
The code below reproduces the problem if incorporated into a clean R package (as done in this .zip), built and loaded with document("/path/to/package"):
#' #import gamlss gamlss.tr gamlss.dist
NULL
#' Use GAMLSS
#'
#' Generate a truncated distribution and use it.
#' #export
use_gamlss <- function() {
print("gen.trun():")
gamlss.tr::gen.trun(par=0,family=TF)
#Error in inherits(object, "gamlss.family") : object 'TF' not found
#gamlss.tr::gen.trun(par=0,family=gamlss.dist::TF)
#Error in parse(text = fname) : <text>:1:1: unexpected '::'
y = rTFtr(1000,mu=10,sigma=5, nu=5)
print("trun():")
truncated_dist = gamlss.tr::trun(par=0,family=TF, local=TRUE)
model = gamlss(y~1, family=truncated_dist)
print(model)
}
use_gamlss() will only start working once a user calls library(gamlss.tr).
This is due to bad design of gamlss.tr in particular the trun.x functions (they take character vectors instead of family objects / they evaluate everything in the function environment instead of the calling environment).
To work around this, you have to make sure that gamlss.distr is in the search path of the execution environment of gamlss.tr functions (This is why ## import-ing it in your package does not help: it would need to be #' #import-ed in gamlss.tr).
This can be achieved by adding it to Depends: of your package.
If you want to avoid that attaching your package also attaches gamlss.distr, you could also add the following at the top of use_gamlss:
nsname <-"gamlss.dist"
attname <- paste0("package:", nsname)
if (!(attname %in% search())) {
attachNamespace(nsname)
on.exit(detach(attname, character.only = TRUE))
}
This would temporarily attach gamlss.dist if it is not attached already.
You can read more on namespaces in R in Hadley Wickham's "Advanced R"

'data' is not an exported object from 'namespace:my_package'

I'm writing a function that uses an external data as follow:
First, it checks if the data is in the data/ folder, if it is not, it creates the data/ folder and then downloads the file from github;
If the data is already in the data/ folder, it reads it, and perform the calculations.
The question is, when I run:
devtools::check()
it returns:
Error: 'data' is not an exported object from 'namespace:my_package'
Should I manually put something on NAMESPACE?
An example:
my_function <- function(x){
if(file.exists("data/data.csv")){
my_function_calculation(x = x)
} else {
print("Downloading source data...")
require(RCurl)
url_base <-
getURL("https://raw.githubusercontent.com/my_repository/data.csv")
dir.create(paste0(getwd(),"/data"))
write.table(url_base,"data/data.csv", sep = ",", quote = FALSE)
my_function_calculation(x = x)
}
}
my_function_calculation <- function(x = x){
data <- NULL
data <- suppressMessages(fread("data/data.csv"))
#Here, I use data...
return(data)
}
It could not be the same in every case, but I've solved the problem by removing the data.R file on R/ folder.
data.R is a file describing all data presented in the package. I had it since the previous version of my code, that had the data built in, not remote (to be downloaded).
Removing the file solved my problem.
Example of data.R:
#' Name_of_the_data
#'
#' Description_of_the_Data
#'
#' #format A data frame with 10000 rows and 2 variables:
#' \describe{
#' \item{Col1}{description of Col1}
#' \item{Col2}{description of Col2}
#' }
"data_name"
No need to remove data.R in /R folder, you just need to decorate the documentation around the NULL keyword as follow:
#' Name_of_the_data
#'
#' Description_of_the_Data
#'
#' #format A data frame with 10000 rows and 2 variables:
#' \describe{
#' \item{Col1}{description of Col1}
#' \item{Col2}{description of Col2}
#' }
NULL
Generally, this happens when you have a mismatch between the names of one of the rda files in data folder and what is described in R/data.R.
In this case, the data reference in the error message is for data.csv, not the data folder. You need to have rda files in the data folder of a R package. If you want to download csv, you need to put them in inst/extdata.
This being said, you might want to consider using tempdir() to save those files in the temp folder of your session instead.
There's 3 things to check:
The documentation is appropriately named:
#' Name_of_the_data
#'
#' Description_of_the_Data
#'
#' #format A data frame with 10000 rows and 2 variables:
#' \describe{
#' \item{Col1}{description of Col1}
#' \item{Col2}{description of Col2}
#' }
data
That the RData file is appropriately named for export in the data/ folder.
That the RData file is loaded with the name data.
If documentation (1) is A, the Rdata file is A.RData (2), but the object (when loaded with load() ) is named B- you're going to get this error exactly.
The problem probably is because how your object was named when you save it.
Suppose I load a file a called it "d", then I save it (as is suggested) with save in the data/ directory as "data":
save(d, file = "data/data.rda")
Then you will run the clean and install package and you will get the following error:
Error: 'data' is not an exported object from 'namespace:YourPakage'
Looks like it does not matter how you declare your object in the roxygen documentation. I guess you must name your OBJECT with the same name you are going to save it and loaded it.
For example, load your dataset as "pib" object, then save as "pib.rda" and declare in roxygen "loadData.R" (for example) your "pib".
#' Datos del PIB
#'
#' #docType data
#'
#' #usage data(pib)
#'
#' #format An object of class ...
#'
#' #keywords datasets
#'
#' #references ----
#'
#' #source ----
#'
#' #examples
#' data(pib)
"pib"
I had this issue because I copied the .rda file into the R\data folder.
Issue was resolved by using usethis::use_data(DataObject) which automatically takes the raw-data (DataObject) file and adds it to the R\data folder within the R package directory.
When I was stumped by the error
Error: 'data' is not an exported object from 'namespace:my_package'
MrFlick's comment above saved me. I had simply changed the name of an .rda file in my data folder. I was unable to get devtools::document() to recreate the NAMESPACE file. The solution was to re-save the data into the .rda file. (Of course I should have remembered that when one loads from an .rda file the name of the R object(s) has nothing to do with the name of the .rda file so renaming the .rda file doesn't do much.)
I spent a few hours trying to fix this. Finally got it to work.
Notes:
Data files have to be of type "rda". "rds" won't work.
File names had to be lower case.
NULL in documentation name didn't work for me. Had to be a lower case string.
In general, it seems the same error message is caused by several things. Anything the checker doesn't like related to data files, it will issue the same error. Hard to debug under those circumstances.
I will add another trap. Working in RStudio
I have assigned a string to MyString and saved in the data folder of my package project:
save(MyString, file="./data/MyString.RData")
My ./R/data.R file contains documentation for this:
#' A character string
#'
"MyString"
This works. But you must use one file per object and not do save(X, Y, Z, file="BitsAndPieces.RData") and then document BitsAndPieces. If you do then you will get the error of this question. Which I did, needless to say.
I had the same error and I would be able to overcome the error as follows.
The data file located at: data/df.RData
The R documentation file located at: R/df.R
I have created the df.RData file by importing the df.txt file into R and using the save() function to create the .RData file. I used the following code block to create .RData file.
x=read.table("df.txt")
save(x,file="df.RData")
Then after running the RCMD check I get the same error as df is not an exported object from namespace "package name".
I have overcome the error by change the variable name of the df.RData file as
df=read.table("df.txt")
save(df,file="df.RData")
Restarting the session solved the problem for me. Somehow the environment was empty and after restart all objects were back, hence solving the diff.
I had the same issue with one of my packages, and I needed to add
LazyData: true
to my DESCRIPTION file.
I had this problem, even renaming the variables and uninstalling the probematic packages didn't work.
I did:
I was trying to carry out the process in a session (tab) of R that was already in use previously, where the terra package had already been requested. This session is not saved, but was being automatically saved to an image in ~/.RData every time Rstudio was closed. So every time I opened Rstudio it retrieved that section (image) and reloaded the previous state causing the conflict between packages.
I solved it by creating a new blank rmarkdown and closing all previously opened sessions, as well as clearing all saved data in the Rstudio "Global environment".
I encountered this "Error: 'weekly' is not an exported object from 'namespace:ISLR'' when I was trying the following:
library(ISLR)
w <- ISLR::weekly
The problem is somehow fixed by changing it to:
w = ISLR::weekly
The = sign made all the difference here.

Why does the #family tag add dots between words in roxygen2?

I am writing an R package and create documentation using roxygen2. I build the package and the documentation using the button Build & Reload in RStudio. According to RStudio's output, it uses devtools::document(roclets=c('rd', 'namespace')) to compile the documentation.
I want to use the #family tag to link together a number of functions in the documentation and this is where my problem occurs. This line
#' #family functions returning some object
is converted in the .Rd file to the following
\seealso{
Other functions.returning.some.object: \code{\link{test2}}
}
I don't want the dots between the words. I have an older package, where this does not happen, even if I recompile the documentation in the exact same setting as I compile the new package. I can see no fundamental difference between this older package and my new attempt.
I have written a very simple test package, where the problem also occurs. It contains a single R file (testpackage.R):
#' Test function 1
#'
#' #param x a number
#'
#' #family functions returning some object
#' #family aggregate functions
#'
#' #export
test1 <- function(x) {
x*x
}
#' Test function 2
#'
#' #param x a number
#'
#' #family functions returning some object
#' #family aggregate functions
#'
#' #export
test2 <- function(x) {
x*x*x
}
The DESCRIPTION file is
Package: testpackage
Type: Package
Title: Package for testing purposes
Version: 1.0
Date: 2015-05-21
Author: Me
Maintainer: Me <me#somewhere.com>
Description: Package for testing purposes
License: GPL-3
NAMESPACE is generated by roxygen. For the documentation test1.Rd, I get:
% Generated by roxygen2 (4.1.1): do not edit by hand
% Please edit documentation in R/testpackage.R
\name{test1}
\alias{test1}
\title{Test function 1}
\usage{
test1(x)
}
\arguments{
\item{x}{a number}
}
\description{
Test function 1
}
\seealso{
Other aggregate.functions: \code{\link{test2}}
Other functions.returning.some.object: \code{\link{test2}}
}
with the unwanted dots in the \seealso section. Clearly, the number of words in the #family tag seems not to matter. I have tried enclosing the text in quotation marks, various kinds of brackets, etc. with no positive effect. Of course, I could edit the Rd files, but this would miss the point of using roxygen2.
R CMD check runs without warnings or errors on testpackage.
Why do these dots appear? And how can I get rid of them?
This is a bug in roxygen2 -- I've logged an issue here. It effectively results from the use of unstack(), which performs some unwanted conversions.
I upgraded to Roxygen 5.0.0 (which is now the CRAN version) and found that the problem went away.

devtools roxygen package creation and rd documentation

I am new to roxygen and am struggling to see how to be able to use it to quickly create a new/custom package.
I.e. I would like to know the minimum requirements are to make a package called package1 using devtools, roxygen2/3 so that I can run the commands
require(package1)
fun1(20)
fun2(20)
to generate 2000 and 4000 random normals respectively
So lets take the simplest example.
If I have two functions fun1 and fun2
fun1 <- function(x){
rnorm(100*x)
}
and
fun2 <- function(y){
rnorm(200*y)
}
the params are numeric, the return values are numeric. I'm pretty sure this isn't an S3 method, lets call the titles fun1 and fun2....im not too sure what other info i would need to provide. I can put fun1 and fun2 in separate .R files and add abit of #' but am unsure to include all relevant requirements for roxygen and also am unsure what to include as relevant requiremetns and how to use it to create the rd documentation to go with a package are. I presume the namespace would just have the names fun1 and fun2? and the package description would just be some generic information relating to me...and the function of the package?
any step by step guides would be gladly received.
EDIT: The below is how far I got to start with...
I can get as far as the following to create a pacakge...but cant use roxygen to make the documentation...
package.skeleton(list = c("fun1","fun2"), name = "package1")
and here is where I am not sure if I am missing a bunch of steps or not...
roxygenise("package1")
so when trying to install i get the following error message
system("R CMD INSTALL package1")
* installing to library ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’
* installing *source* package ‘package1’ ...
** R
** preparing package for lazy loading
** help
Warning: /path.to.package/package1/man/package1-package.Rd:32: All text must be in a section
*** installing help indices
Error in Rd_info(db[[i]]) :
missing/empty \title field in '/path.to.package/package1/man/fun1.Rd'
Rd files must have a non-empty \title.
See chapter 'Writing R documentation' in manual 'Writing R Extensions'.
* removing ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/package1’
I'm surprised #hadley says to not use package.skeleton in his comment. I would use package.skeleton, add roxygen comment blocks, then delete all the files in the "man" directory and run roxygenize. However, since Hadley says "Noooooooooo", here's the minimum you need to be able to build a package that passes R CMD check and exports your functions.
Create directory called "package1". Under that directory, create a file called DESCRIPTION and put this in it (edit it appropriately if you like):
DESCRIPTION
Package: package1
Type: Package
Title: What the package does (short line)
Version: 0.0.1
Date: 2012-11-12
Author: Who wrote it
Maintainer: Who to complain to <yourfault#somewhere.net>
Description: More about what it does (maybe more than one line)
License: GPL
Now create a directory called "R" and add a file for each function (or, you can put both of your functions in the same file if you want). I created 2 files: fun1.R and fun2.R
fun1.R
#' fun1
#' #param x numeric
#' #export
fun1 <- function(x){
rnorm(100*x)
}
fun2.R
#' fun2
#' #param y numeric
#' #export
fun2 <- function(y){
rnorm(200*y)
}
Now you can roxygenize your package
R> library(roxygen2)
Loading required package: digest
R> list.files()
[1] "package1"
R> roxygenize("package1")
Updating collate directive in /home/garrett/tmp/package1/DESCRIPTION
Updating namespace directives
Writing fun1.Rd
Writing fun2.Rd
Since you mentioned devtools in the title of your Q, you could use the build and install functions from devtools
build('package1')
install('package1')
Or you can exit R and use the tools that come with R to build/check/install.
$ R CMD build package1
$ R CMD check package1_0.0.1.tar.gz
$ R CMD INSTALL package1_0.0.1.tar.gz
Now, fire up R again to use your new package.
$ R --vanilla -q
library(package1)
fun1(20)
fun2(20)
But, figuring out the minimum requirements is unlikely to help you (or the users of your package) much. You'd be much better off studying one of the many, many packages that use roxgen2.
Here's a better version of the fun1.R file which still doesn't use all the roxygen tags that it could, but is much better than the bare minimum
Modified fun1.R
#' fun1
#'
#' This is the Description section
#'
#' This is the Details section
#'
#' #param x numeric. this is multiplied by 100 to determine the length of the returned vector
#' #return a numeric vector of random deviates of length \code{100 * x}
#' #author your name
#' #seealso \code{\link{fun2}}
#' #examples
#' fun1(2)
#' length(fun1(20))
#' #export
fun1 <- function(x){
rnorm(100*x)
}
Much later - You could let RoxygenReady prepare your functions with the minimal Roxygen annotation skeleton. It basically brings you from your 2 input functions to GSee's answer, which is the input of Roxygen2.

Resources