How to call R script from another R script, both in same package? - r

I'm building a package that uses two main functions. One of the functions model.R requires a special type of simulation sim.R and a way to set up the results in a table table.R
In a sharable package, how do I call both the sim.R and table.R files from within model.R? I've tried source("sim.R") and source("R/sim.R") but that call doesn't work from within the package. Any ideas?
Should I just copy and paste the codes from sim.R and table.R into the model.R script instead?
Edit:
I have all the scripts in the R directory, the DESCRIPTION and NAMESPACE files are all set. I just have multiple scripts in the R directory. ~R/ has premodel.R model.R sim.R and table.R. I need the model.R script to use both sim.R and table.R functions... located in the same directory in the package (e.g. ~R/).

To elaborate on joran's point, when you build a package you don't need to source functions.
For example, imagine I want to make a package named TEST. I will begin by generating a directory (i.e. folder) named TEST. Within TEST I will create another folder name R, in that folder I will include all R script(s) containing the different functions in the package.
At a minimum you need to also include a DESCRIPTION and NAMESPACE file. A man (for help files) and tests (for unit tests) are also nice to include.
Making a package is pretty easy. Here is a blog with a straightforward introduction: http://hilaryparker.com/2014/04/29/writing-an-r-package-from-scratch/

As others have pointed out you don't have to source R files in a package. The package loading mechanism will take care of losing the namespace and making all exported functions available. So usually you don't have to worry about any of this.
There are exceptions however. If you have multiple files with R code situations can arise where the order in which these files are processed matters. Often it doesn't matter or the default order used by R happens to be fine. If you find that there are some dependencies within your package that aren't resolved properly you may be faced with a situation where a custom processing order for the R files is required. The DESCRIPTION file offers the optional Collate field for this purpose. Simply list all your R files in the order they should be processed to satisfy the dependencies.

If all your files are in R directory, any function will be in memory after you do a package build or Load_All.
You may have issues if you have code in files that is not in a function tho.
R loads files in alphabetical order.
Usually, this is not a problem, because functions are evaluated when they are called for execution, not at loading time (id. a function can refer another function not yet defined, even in the same file).
But, if you have code outside a function in model.R, this code will be executed immediately at time of file loading, and your package build will fail usually with a
ERROR: lazy loading failed for package 'yourPackageName'
If this is the case, wrap the sparse code of model.R into a function so you can call it later, when the package has fully loaded, external library too.
If this piece of code is there for initialize some value, consider to use_data() to have R take care of load data into the environment for you.
If this piece of code is just interactive code written to test and implement the package itself, you should consider to put it elsewhere or wrap it to a function anyway.
if you really need that code to be executed at loading time or really have dependency to solve, then you must add the collate line into DESCRIPTION file, as already stated by Peter Humburg, to force R to load files order.
Roxygen2 can help you, put before your code
#' #include sim.R table.R
call roxygenize(), and collate line will be generate for you into the DESCRIPTION file.
But even doing that, external library you may depend are not yet loaded by the package, leading to failure again at build time.
In conclusion, you'd better don't leave code outside functions in a .R file if it's located inside a package.

Since you're building a package, the reason why you're having trouble accessing the other functions in your /R directory is because you need to first:
library(devtools)
document()
from within the working directory of your package. Now each function in your package should be accessible to any other function. Then, to finish up, do:
build()
install()
although it should be noted that a simple document() call will already be sufficient to solve your problem.

Make your functions global by defining them with <<- instead of <- and they will become available to any other script running in that environment.

Related

Read a package NAMESPACE file

I am looking for a fast R solution to read a package NAMESPACE file. The solution should contain already preprocessed (and aggregated) records and separated imports and exports.
Unfortunately I can’t use getNamespaceExports("dplyr")/getNamespaceImports("dplyr") as they need the package to be loaded to the R session, which is too slow.
I need a solution which simply process a text from the NAMESPACE file. Any solutions using external packages as well as partial solutions would still be welcome.
The raw data we could grabbed with a call like readLines("https://raw.githubusercontent.com/cran/dplyr/master/NAMESPACE"). roxygen2 generated files are formatted properly, but this will be not true for all manually generated files.
EDIT:
Thanks to Konrad R. answer I could develop such functionality in my new CRAN package - pacs. I recommended to check pacs::pac_namespace function. There is even one function which goes one step further, comparing NAMESPACE files between different package versions pacs::pac_comapre_namespace.
The function is included in R as base::parseNamespaceFile. Unfortunately the function does not directly take a path as an argument. Instead it constructs the path from a package name and the library location. However, armed with this knowledge you should be able to call it; e.g.:
parseNamespaceFile('dplyr', .libPaths()[1L])
EDIT
Somebody has to remember that the whole packages imports (like import(rlang)) have to be still invoked with the same function and the exports for them extracted. Two core elements are using parse on NAMESPACE code and then using the recursive extract function parseDirective.

What is the difference between library()/require() and source() in r?

Looked around and still not sure what is the difference between library()/require() and source() in r? According to this SO question: What is the difference between require() and library()? it looks like library() and require() are the same thing, maybe one is legacy. Is source() for lazy developers that don't want to create a library? When do you use each of these constructs?
The differences between library and require are already well documented in What is the difference between require() and library()?.
So, I will focus on how source differs from these. In fact they are fundamentally quite different commands. Neither library nor require actually execute any code. They simply attach a namespace, in a lazy fashion, meaning that individual functions in the package are not run unless they are actually called later. Source on the other hand does something quite different which is to execute all of the code in the file at that time.
A small caveat: packages can be made to actually run some code at the time of package loading or attaching, via the .onLoad and .onAttach functions. Have a look here: https://stat.ethz.ch/R-manual/R-devel/library/base/html/ns-hooks.html
source runs the code in a .R file, line by line.
library and require load and attach R packages.
Is source() for lazy developers that don't want to create a library?
You're correct that source is for the cases when you don't have a package. Laziness is not the only reason, sometimes packages are not appropriate---packages provide functionality, but don't do things. Perhaps I have a script that pulls data from a database, fits a model, and makes some predictions. A package may provide functions to help me do that, but it does not actually do it. A script saved in a .R file and run with source() could run the commands and complete the task.
I do want to address this:
it looks like library() and require() are the same thing, maybe one is legacy.
They both do the same thing (load and attach a package). The main difference is that library() will throw an error and stop the script if the package is not available, whereas require() will return TRUE or FALSE depending on its success. The general consensus is that library is better so that your script stops with a nice clear error and you can install the missing package before proceeeding. The question linked has a more thorough discussion which I won't try to replicate here.

Finding help files for moduled package functions in R

I am writing an R package which is organized into modules as per Sebastian Warnholz's modules package. Each function is organized into its own R file under, e.g., R/m1/fun.R. Each one of those files begins with roxygen code. The modules are defined in another file, R/modules.R. Here's an idea of how that file is structured:
#' Module 1
#' #name m1
#' #export
m1 <- modules::module({
expose("R/m1/fun.R")
expose("R/m1/foo.R")
})
The package checks and builds cleanly and I can call functions by issuing m1$fun() and m1$foo(). However, calling the help files don't work, no matter what I try (i.e., combinations of ? or help() with the function names, with or without the module prefix m1$.
Actually, I can't even expect the help files to be there, because after running devtools::document(), the roxygen code is not converted into man/*.Rd files. So I guess my problem is having devtools::document() search into the R subfolders. Running devtools::document("R/m1") doesn't do the trick, though.
One thing that works is putting the function scripts in the parent folder, R/, but then they lose the module scope and the help files (but not the functions themselves) can be seen at package level. Moreover, the "usage" section will state "foo(...)" instead of "m1$foo(...)", which sounds inadequate but I am not sure is currently fixable. This is my first time working with modules, so I was wondering if there's a cleaner way of organizing my functions and help files.

Where to put R files that generate package data

I am currently developing an R package and want it to be as clean as possible, so I try to resolve all WARNINGs and NOTEs displayed by devtools::check().
One of these notes is related to some code I use for generating sample data to go with the package:
checking top-level files ... NOTE
Non-standard file/directory found at top level:
'generate_sample_data.R'
It's an R script currently placed in the package root directory and not meant to be distributed with the package (because it doesn't really seem useful to include)
So here's my question:
Where should I put such a file or how do I tell R to leave it be?
Is .Rbuildignore the right way to go?
Currently devtools::build() puts the R script in the final package, so I shouldn't just ignore the NOTE.
As suggested in http://r-pkgs.had.co.nz/data.html, it makes sense to use ./data-raw/ for scripts/functions that are necessary for creating/updating data but not something you need in the package itself. After adding ./data-raw/ to ./.Rbuildignore, the package generation should ignore anything within that directory. (And, as you commented, there is a helper-function devtools::use_data_raw().)

Including Script Files in an R Extension Package

I'm creating an R package and I need it to include a couple of non R script files which get called by one of my functions. I need these script files to be distributed with the package, naturally. So that leaves me with two questions:
a) In which directory of the package
tree should I place these files? b) Is that location mandatory or just convention?
Do I need to change any other
settings or configurations or will
they just get copied to the
directory mentioned in #1 and then I
can figure out the path using
system.file()?
I've tried to find the answer in the Writing R Extensions document, but it didn't jump out at me. And, of course, I didn't read the whole thing. Am I being too honest here?
I think you want either exec/ at the top-level (even though that is labeled 'still experimental, or subdirectory of inst as everything in inst/ gets copied verbatim into the package.
A quick example from the packages I have expanded in source is gdata which has inst/perl, inst/xls and inst/bin. These you could then call from R itself by computing the path of the installed package using system.file().

Resources