How to organize R scripts that use functions stored in the R/ directory of an R package

I have the following package structure:
mypackage/
|-- .Rbuildignore
|-- .gitignore
|-- DESCRIPTION
|-- NAMESPACE
|-- inst
|   `-- extdata
|       `-- mydata.csv
|-- vignettes
|-- R
|   `-- utils.R
`-- mypackage.Rproj
Currently I store all the functions in the R/ directory. My question is: where should I put scripts (e.g. one named try_functions.R) that try out the functions stored in R/? These scripts also use data stored in inst/extdata/.
Also, when developing in RStudio, what does the workflow look like for updating and trying out the package after adding or fixing functions in R/?

It sounds to me like testthat is the package you are looking for. By "try", I presume you mean "test," and the way that it is canonically done for the testthat package is within a tests/testthat directory for the package.
Hadley's "R Packages" book has a good deal more information about best practices, and you can find many good examples by looking at GitHub.
Some excerpts from the docs:
Testing is a vital part of package development. It ensures that your
code does what you want it to do. Testing, however, adds an additional
step to your development workflow. The goal of this chapter is to show
you how to make this task easier and more effective by doing formal
automated testing using the testthat package.
And implementing:
To set up your package to use testthat, run:
devtools::use_testthat()
This will:
Create a tests/testthat directory.
Adds testthat to the Suggests field in the DESCRIPTION.
Creates a file tests/testthat.R that runs all your tests when R CMD
check runs. (You’ll learn more about that in automated checking.)
You also might look at the rprojroot package for referencing various places within the directory of the package.
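As a rough illustration of what a file under tests/testthat/ might contain, here is a minimal sketch. The file name test-utils.R and the function add_one() are hypothetical stand-ins for whatever lives in R/utils.R; they are not from the question. Note also that in current tooling the setup helper quoted above is provided as usethis::use_testthat(). The commented lines at the end show one possible edit-and-retry loop, assuming devtools is installed.

```r
# tests/testthat/test-utils.R (hypothetical file name)
library(testthat)

# Stand-in for a function that would normally live in R/utils.R;
# inside a real package the tests see it without redefining it.
add_one <- function(x) x + 1

test_that("add_one() increments numeric input", {
  expect_equal(add_one(1), 2)
  expect_equal(add_one(c(1, 2)), c(2, 3))
})

test_that("add_one() rejects non-numeric input", {
  expect_error(add_one("a"))
})

# Data shipped in inst/extdata/ could be located from a test like this
# (commented out because it needs 'mypackage' to be installed or loaded):
# csv_path <- system.file("extdata", "mydata.csv", package = "mypackage")
# mydata <- read.csv(csv_path)

# A possible edit-and-retry loop from the R console (assumes devtools):
# devtools::load_all()  # reload everything in R/ after editing
# devtools::test()      # run the files under tests/testthat/
```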

The canonical place for keeping arbitrary R scripts is the inst/ subdirectory.
Note that tests of your package's functionality are better placed in the tests/ subdirectory. Loose scripts that are not tests (or at least not tests of your package) should be placed in inst/. These can be test scripts for checking the deployment environment, tests for checking production data quality, exec scripts to be plugged into crontab, or whatever is useful or necessary for putting your package into action.
Quoting the Writing R Extensions manual, "Package subdirectories":
The contents of the inst subdirectory will be copied recursively to the installation directory. Subdirectories of inst should not interfere with those used by R (currently, R, data, demo, exec, libs, man, help, html and Meta, and earlier versions used latex, R-ex). The copying of the inst happens after src is built so its Makefile can create files to be installed. To exclude files from being installed, one can specify a list of exclude patterns in file .Rinstignore in the top-level source directory. These patterns should be Perl-like regular expressions (see the help for regexp in R for the precise details), one per line, to be matched case-insensitively against the file and directory paths, e.g. doc/.*[.]png$ will exclude all PNG files in inst/doc based on the extension.
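To make this concrete, a loose script kept under inst/ can be located and run from the installed package. The sketch below assumes a hypothetical layout in which try_functions.R sits in inst/scripts/; the helper name run_package_script() and the scripts/ subdirectory are illustrative choices, not something R prescribes.

```r
# Hypothetical helper: run a script that was installed from inst/scripts/
run_package_script <- function(script, package) {
  # inst/ is dropped on installation, so the installed path is
  # <package>/scripts/<script>, not <package>/inst/scripts/<script>.
  path <- system.file("scripts", script, package = package)
  if (!nzchar(path)) {
    stop("Script '", script, "' not found in package '", package, "'")
  }
  # Evaluate in a fresh environment so the script does not clutter
  # the caller's workspace.
  source(path, local = new.env())
  invisible(path)
}

# Intended use once 'mypackage' is installed (not run here):
# run_package_script("try_functions.R", "mypackage")
```

Sourcing into a fresh environment is a design choice; dropping the local argument would instead leave the script's objects in the global environment.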

Related

How to add external data folder into developing R package? [duplicate]

In the documentation, R suggests that raw data files (not Rdata nor Rda) should be placed in inst/extdata/
From the first paragraph in: http://cran.r-project.org/doc/manuals/R-exts.html#Data-in-packages
The data subdirectory is for data files, either to be made available
via lazy-loading or for loading using data(). (The choice is made by
the ‘LazyData’ field in the DESCRIPTION file: the default is not to do
so.) It should not be used for other data files needed by the package,
and the convention has grown up to use directory inst/extdata for such
files.
So, I have moved all of my raw data into this folder, but when I build and reload the package and then try to access the data in a function with (for example):
read.csv(file=paste(path.package("my_package"),"/inst/extdata/my_raw_data.csv",sep=""))
# .path.package is now path.package in R 3.0+
I get the "cannot open file" error.
However, it does look like there is a folder called /extdata in the package directory with the files in it (post-build and install). What's happening to the /inst folder?
Does everything in the /inst folder get pushed into the / of the package?
More useful than using file.path would be to use system.file. Once your package is installed, you can grab your file like so:
fpath <- system.file("extdata", "my_raw_data.csv", package="my_package")
fpath will now have the absolute path on your HD to the file.
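The reason the original call failed is that inst/ itself does not exist after installation: its contents are copied to the top level of the installed package. A small sketch of the lookup follows. The helper extdata_path() is a hypothetical wrapper, and the runnable part uses the DESCRIPTION file of the utils package only because it is present in every R installation.

```r
# Hypothetical wrapper: path to a file installed from inst/extdata/.
# Note that "inst" does not appear anywhere in the lookup.
extdata_path <- function(file, package) {
  # mustWork = TRUE turns a missing file into an error instead of ""
  system.file("extdata", file, package = package, mustWork = TRUE)
}

# Intended use once 'my_package' is installed (not run here):
# my_raw_data <- read.csv(extdata_path("my_raw_data.csv", "my_package"))

# Runnable demonstration with a file every installed package has
desc <- system.file("DESCRIPTION", package = "utils", mustWork = TRUE)
file.exists(desc)

# By default a missing file yields an empty string rather than an error
missing_path <- system.file("extdata", "no_such_file.csv", package = "utils")
identical(missing_path, "")
```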
You were both very close and essentially had this. A formal reference from 'Writing R Extensions' is:
1.1.3 Package subdirectories
[...]
The contents of the inst subdirectory will be copied recursively
to the installation directory. Subdirectories of inst should not
interfere with those used by R (currently, R, data, demo,
exec, libs, man, help, html and Meta, and earlier versions
used latex, R-ex). The copying of the inst happens after src
is built so its Makefile can create files to be installed. Prior to
R 2.12.2, the files were installed on POSIX platforms with the permissions in the package sources, so care should be taken to ensure
these are not too restrictive: R CMD build will make suitable
adjustments. To exclude files from being installed, one can specify a
list of exclude patterns in file .Rinstignore in the top-level
source directory. These patterns should be Perl-like regular
expressions (see the help for regexp in R for the precise details),
one per line, to be matched against the file and directory paths,
e.g. doc/.*[.]png$ will exclude all PNG files in inst/doc based on
the (lower-case) extension.

Where to put a Dockerfile in an R package

A contributor has added a Dockerfile to my R package. When trying to upload it to CRAN, it gets flagged:
Non-standard file/directory found at top level:
'Dockerfile'
Is there a more appropriate placement for Dockerfiles within the library's directory structure?
Many thanks
You can leave it in the top-level directory. Add a pattern for the Dockerfile (and any other non-standard files) to the .Rbuildignore file so that they are excluded from the built package.
.Rbuildignore uses regular expressions, one per line. Here's an example .Rbuildignore file:
^.*\.Rproj$
^\.Rproj\.user$
^\.travis\.yml$
^.*\.tar\.gz$
^local
^Dockerfile$
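Because the entries are regular expressions rather than file names, escaping and anchoring matter. The sketch below only illustrates the regex semantics with grepl(); it is not the code R uses when building a package, and the file names are made up for the example.

```r
# Made-up top-level paths for illustration
files <- c(".travis.yml", "docs/xtravis-yml.md", "Dockerfile", "mypackage.Rproj")

# Unescaped and unanchored: each "." matches any character
loose <- ".travis.yml"
# Escaped and anchored: matches only the file literally named .travis.yml
strict <- "^\\.travis\\.yml$"

loose_hits  <- grepl(loose,  files, perl = TRUE, ignore.case = TRUE)
strict_hits <- grepl(strict, files, perl = TRUE, ignore.case = TRUE)
docker_hits <- grepl("^Dockerfile$", files, perl = TRUE, ignore.case = TRUE)

# Show which paths each pattern would catch
data.frame(files, loose_hits, strict_hits, docker_hits)
```

The loose pattern also catches docs/xtravis-yml.md, which is why the anchored, escaped form is the safer one to put in .Rbuildignore.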

Run unit tests with testthat without package

I have a Shiny application that uses about 4 functions. I would like to test these functions, but the code is not a package. How should I structure my code, and how can I run these tests without devtools?
You can execute tests with testthat::test_dir() or testthat::test_file(). Neither relies on the code being in a package, or using devtools, just the testthat package.
There are few requirements on how to structure your code.
If it were me, I would create a tests directory and add my test scripts under there, which would look something like:
|- my_shiny_app
|  |- app.R
|  |- tests
|  |  |- test_foo.R
|  |  |- test_bar.R
Then you can run your tests with test_dir('tests'), assuming you're in the my_shiny_app directory.
Your test scripts will have the same structure they have for packages, but you would replace the library() call with a source() call referencing the file where your functions are defined.
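A self-contained sketch of this layout is shown below. It builds a throwaway my_shiny_app directory in tempdir() so it can be run anywhere. The file functions.R and the function double_it() are hypothetical, and the relative source() path relies on testthat running test files with the tests directory as the working directory, which is my understanding of its behavior rather than something stated above.

```r
library(testthat)

# Temporary stand-in for the my_shiny_app directory
app_dir <- file.path(tempdir(), "my_shiny_app")
dir.create(file.path(app_dir, "tests"), recursive = TRUE, showWarnings = FALSE)

# Hypothetical file holding the functions the app uses
writeLines("double_it <- function(x) x * 2", file.path(app_dir, "functions.R"))

# Test script: source() takes the place of library(mypackage).
# Assumption: the path resolves relative to the tests/ directory.
writeLines(c(
  'source("../functions.R")',
  'test_that("double_it doubles its input", {',
  '  expect_equal(double_it(2), 4)',
  '})'
), file.path(app_dir, "tests", "test_foo.R"))

# Run every test file in tests/; no package or devtools involved
results <- test_dir(file.path(app_dir, "tests"))
```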
If you have a few functions without a package structure, it is better to write single test files manually (with a simple if/error-catching system) that you call with Rscript test_file1.R.
If you start to use the package format instead (which would be advisable for further 'safe' development) and you still do not want to use testthat, I advise you to follow this blog post: here
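A hand-rolled test file of the kind meant to be run with Rscript test_file1.R could look roughly like this. It uses base R only; the function double_it() and the helper run_check() are hypothetical, and in real use the function would come from a source() call rather than being defined inline.

```r
# test_file1.R: minimal manual test script (hypothetical names)
# In real use: source("functions.R") instead of the inline definition.
double_it <- function(x) x * 2

# Evaluate one check, catching errors so a single failure
# does not stop the remaining checks from running.
run_check <- function(description, expr) {
  ok <- tryCatch(isTRUE(expr), error = function(e) FALSE)
  cat(sprintf("[%s] %s\n", if (ok) "PASS" else "FAIL", description))
  ok
}

results <- c(
  run_check("doubles a scalar", double_it(2) == 4),
  run_check("doubles a vector", identical(double_it(c(1, 2)), c(2, 4))),
  run_check("rejects character input",
            inherits(try(double_it("a"), silent = TRUE), "try-error"))
)

# An uncaught error makes Rscript exit with a non-zero status
if (!all(results)) stop("some checks failed")
```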

Can I access an external file when testing an R package?

I am using the testthat package to test an R package that is within a larger repository. I would like to test the contents of a file outside of the R package.
Can I reference a file that is located outside of an R package while testing?
What I have tried
A reproducible example can be downloaded as MyRepo.tar.gz
My repository is called "MyRepo", and it includes an R package, "MyRpkg", and a folder full of miscellaneous scripts:
~/MyRepo/
~/MyRepo/MyRpkg
~/MyRepo/Scripts
The tests in "MyRpkg" are in the /tests/ folder
~/MyRepo/MyRpkg/tests/test.myscript.R
And I want to be able to test a file in the Scripts folder:
~/MyRepo/Scripts/myscript.sh
I would like to read the script to test the contents of the first line doing something like this:
check.script <- readLines("../../../Scripts/myscript.sh")[1]
expect_true(grepl("echo", check.script))
This works fine if I start from the MyRepo directory:
cd ~/MyRepo
R CMD check MyRpkg
But if I move to another directory, it fails:
cd
R CMD check MyRepo/MyRpkg
As it says in R-exts
The directory tests is copied to the check area, and the tests are run with the copy as the working directory and with R_LIBS set to ensure that the copy of the package installed during testing will be found by library(pkg_name).
By default, the check directory is created in the current directory. Thus when running R CMD check from ~/MyRepo, the tests directory is copied to ~/MyRepo/MyRpkg.Rcheck/tests and hence
check.script <- readLines("../../../Scripts/myscript.sh")[1]
is interpreted as
check.script <- readLines("~/MyRepo/Scripts/myscript.sh")[1]
as required. However starting from ~/ would imply
check.script <- readLines("~/Scripts/myscript.sh")[1]
which isn't what you want. A work-around is to specify the directory in which the check directory is created, i.e.
R CMD check -o MyRepo MyRepo/MyRpkg
so that the copied tests directory has the same "grandparent" as the original tests directory.
Still, I wonder why the file must be external to the package. If you want to use the file in the package tests, it would make sense to include the file in the package. You could create an inst directory and put the Scripts directory in there, so that the Scripts directory will be copied to the package directory on installation and then
check.script <- readLines("../foo/Scripts/myscript.sh")[1]
could be used inside the test script, since the package is installed in MyRpkg.Rcheck/foo during R CMD check. Alternatively you could create an exec directory and put the script file in there, then
check.script <- readLines("../foo/exec/myscript.sh")[1]
would work. As both of these solutions only need to find the package installed during testing it wouldn't matter where you ran R CMD check from. See Package subdirectories and Non-R scripts in packages for more info.
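If the script is shipped inside the package, the relative path can be avoided altogether by asking R where the installed copy lives. The sketch below is one way to do that under stated assumptions: the helper first_line_matches() is hypothetical, the runnable part uses a temporary file as a stand-in for myscript.sh, and the commented lines assume Scripts/ has been moved under inst/ in MyRpkg.

```r
# Hypothetical helper: does the first line of a file contain `pattern`?
first_line_matches <- function(path, pattern) {
  first <- readLines(path, n = 1L)
  length(first) == 1L && grepl(pattern, first, fixed = TRUE)
}

# Runnable demonstration with a temporary stand-in for myscript.sh
tmp <- tempfile(fileext = ".sh")
writeLines(c("echo 'hello'", "exit 0"), tmp)
has_echo <- first_line_matches(tmp, "echo")
has_echo

# In the package tests the path could come from the installed package,
# assuming the script is installed from inst/Scripts/ (not run here):
# script <- system.file("Scripts", "myscript.sh", package = "MyRpkg",
#                       mustWork = TRUE)
# expect_true(first_line_matches(script, "echo"))
```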

Resources