In R, how do I load one package's git branch from another package?
There are two packages, call them producer and consumer1. I am refactoring my code by moving a bunch of function definitions and tests from producer to consumer1.
I'm creating git branches rfctrProd and rfctrCons1 for producer and consumer1, respectively. In rfctrCons1, I need a statement doing something like
#' @import producer, gitBranch = rfctrProd
Also, I'll need to do something similar with other packages that import producer, to make sure I haven't broken them either. (I think the functions I'm refactoring are only used by consumer1, but I want to be sure before I merge my changes.)
You don't use roxygen comments to specify the import branch; the roxygen tags only declare the functions themselves. You specify the branch in the DESCRIPTION file, under Remotes:. Assuming the package is hosted on GitHub (the default), you can do:
Remotes:
username/producer@rfctrProd
If it's not GitHub, have a look here for the other syntax.
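For context, here is a minimal sketch of what consumer1's DESCRIPTION might contain while the refactoring branches exist (username is a placeholder for the actual GitHub account):

Imports:
    producer
Remotes:
    username/producer@rfctrProd

With that in place, installing consumer1's dependencies (for example via remotes::install_deps() or devtools::install()) should pull producer from the rfctrProd branch instead of the default branch. Remember to remove or update the Remotes: entry once rfctrProd has been merged into producer's default branch.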
I'm developing a package in RStudio with usethis, trying to make use of best practices. Previously, I had run usethis::use_tidy_eval(). Now, I'm using data.table, and set this up by running usethis::use_data_table(). I get a warning,
Warning message:
replacing previous import ‘data.table:::=’ by ‘rlang:::=’ when loading ‘breakdown’
because the NAMESPACE contains these two lines:
importFrom(rlang,":=")
importFrom(data.table,":=")
It turns out I no longer need usethis::use_tidy_eval(), so I'd like to revert it and in doing so get rid of the warning.
How can I undo whatever usethis helper functions do? Must I edit the NAMESPACE myself? How do I know what else was modified by usethis::use_tidy_eval()? What about undoing usethis::use_pipe()?
Unless you made a Git commit before and after running that code, there's probably not an extremely easy way. The two options I'd consider would be:
Read the source code of the function. This can require some hopping around to find definitions of helper functions, but use_tidy_eval looks like it:
adds roxygen to Suggests in DESCRIPTION
adds rlang to Imports in DESCRIPTION
adds the template R file tidy-eval.R
asks you to run document() which is what actually updates the NAMESPACE. You can find the lines added by looking for the importFrom roxygen tags in the template file.
To undo this, you should just be able to delete all of the above. However, you need to be a bit careful - e.g. if you import functions from rlang outside of tidy-eval.R, removing it from DESCRIPTION might prevent installation. Hopefully any such issues would be revealed by devtools::check() if they do happen.
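As a minimal sketch of the undo, assuming rlang is not imported anywhere else in your package (the desc package is used here as a convenience; you could equally edit DESCRIPTION by hand):

# remove the template file added by use_tidy_eval()
file.remove("R/utils-tidy-eval.R")  # may be "R/tidy-eval.R" in older usethis versions
# drop rlang from Imports in DESCRIPTION
desc::desc_del_dep("rlang", type = "Imports")
# regenerate NAMESPACE from the remaining roxygen tags
devtools::document()
# make sure nothing else relied on the removed pieces
devtools::check()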
The other option would be to get an older version of your package, run use_tidy_eval() and document() and then compare the changes. That will be more comprehensive and might catch things I missed above, but the same caveats about not being able to necessarily just reverse everything still apply.
Same strategy for use_pipe().
Sidenote: there are probably ways to adequately qualify different uses of := so that both can coexist in your package, in case that would be preferable.
Say I have a package that has 5 packages in Depends of the DESCRIPTION file, and I have just realised it is not good practice to have this many packages in Depends, due to the inevitable import clashes that are starting to pop up as the number of function imports increases. I'd like to move, say, only the package pkg to Imports, but I have no clue which functions of pkg are being used in my package. Ideally, I would have unit tests with full coverage of the package source code, and by simply removing pkg from the dependencies I would identify the pkg-specific imports from the resulting could not find function "foo" test errors. But unfortunately, I do not have that breadth of test coverage. I was wondering if there is a more efficient way than going through all the package code to identify these imports.
That is very straightforward. Change
Depends: pkgA, pkgB, pkgC
to
Imports: pkgA, pkgB, pkgC
and also add this to the NAMESPACE file:
import("pkgA")
import("pkgB")
import("pkgC")
which will globally import all exported symbols so you can continue as before.
You can also selectively import via
importFrom("pkgA", "func1", "func2", "func3")
and if you run R CMD check it will actually (very helpfully) tell you which functions need this. The second method is somewhat more precise but a little more work to set up.
And I don't think we have a tool to remove 'spurious imports'. Finding which imports are unused may be something you have to check manually (by trying to remove one and seeing if the package still builds and checks fine).
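As a rough, semi-automated sketch (assuming your package source is the working directory and pkg is installed), you could parse your R files and intersect the symbols they use with pkg's exports to draft an importFrom() directive:

# every symbol name used in the package's R/ files
used <- unique(unlist(lapply(
  list.files("R", full.names = TRUE),
  function(f) all.names(parse(f))
)))
# which of those does pkg export?
intersect(used, getNamespaceExports("pkg"))

This can over-report (e.g. a locally defined name that happens to match one of pkg's exports), so treat the result as a starting point and let R CMD check confirm it.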
I'm trying to run a testthat script using GitHub Actions.
I would like to test a feature of my function that allows it to be combined with (many) external packages. I want to exercise these external packages during R CMD check, but I don't want to load them generally (i.e. by putting them into the DESCRIPTION), since most people will not use these external packages.
Any ideas how to just include an external package in the testing files but not in the DESCRIPTION?
Thanks!
I think you describe a very standard use of Suggests.
I see two related but separable issues:
You want to test something using CI, in this case GHA. That is fine. Because you control the execution of the code, you could move your code from the test runner to, say, inst/examples and call it explicitly. That way the standard check of 'is the package using undeclared code' passes, as inst/examples is not checked.
You want to not force other people to have to load these packages. That is fine too, and we have Suggests: for this! Read Section 1.1 of Writing R Extensions about all the detailed semantics. If your package invokes other packages via tests, then every R CMD check touches those tests (and the external packages), so they must be declared. But you already know that only "some" people will want to use this "some of the time": that is precisely what Suggests: does, and you bracket the use with if (requireNamespace(pkgHere, quietly=TRUE)).
You can go either way, or even combine both. But you cannot call packages from tests and not declare them.
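As a minimal sketch of that pattern, with a hypothetical suggested package extrapkg and a hypothetical my_function() from your own package:

In DESCRIPTION:

Suggests:
    extrapkg,
    testthat

In the test file:

test_that("plays nicely with extrapkg", {
  testthat::skip_if_not_installed("extrapkg")
  result <- my_function(extrapkg::some_helper(1:10))
  expect_length(result, 10)
})

skip_if_not_installed() (or an explicit requireNamespace() guard) means the test runs where the suggested package is available and is skipped cleanly everywhere else, while the Suggests: entry keeps R CMD check satisfied.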
I am trying to closely follow @hadley's book to learn best practices in writing R packages. And I was thrilled to read these lines about the philosophy of the book:
anything that can be automated, should be automated. Do as little as
possible by hand. Do as much as possible with functions.
So when I was reading about dependencies and the (sort of) confusing differences between import directives in the NAMESPACE file and the "Imports:" field in the DESCRIPTION file, I was hoping that roxygen2 would automatically handle both of them. After all
Every package mentioned in NAMESPACE must also be present in the
Imports or Depends fields.
I was hoping that roxygen2 would take every @import in my functions and make sure it is included in the DESCRIPTION file. But it does not do that automatically.
So I either have to add it manually to the DESCRIPTION file or almost manually using devtools::use_package.
Looking around for an answer, I found this question on SO, where @hadley confirms in the comments that
Currently, the namespace roclet will modify NAMESPACE but not
DESCRIPTION
and other posts (e.g. here or here) where collate_roclet is discussed, but "This only matters if your code has side-effects; most commonly because you’re using S4".
I wonder:
the reason that DESCRIPTION is not automatically updated (sort of contradicting the aforementioned philosophy, which is presumably shared by roxygen2), and
whether someone has already crafted a way to do it
I have written a little R package for that task:
https://github.com/markusdumke/pkghelper
It scans the R Code and NAMESPACE for packages in use and adds them to the Imports section.
The namespace_roclet edits the NAMESPACE file based on the tags added in the script before each function. As there are three types of dependencies (Depends, Imports, and Suggests), a similar method to the one used by the namespace_roclet would require three different tags (note that Imports would need a distinct tag, to differentiate it from the packages to import in the NAMESPACE).
If you are willing to take a semi-automated approach, you could identify the packages you have used and add the missing ones to DESCRIPTION, in the appropriate sections.
library(reinstallr)
package.dir <- getwd()
base_path <- normalizePath(package.dir)
files <- list.files(file.path(base_path, "R"), full.names = TRUE)
packages <- unique(reinstallr:::scan_for_packages(files)$package)
packages
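If you trust that list, you could then feed it to usethis (devtools::use_package() in older versions) to write the DESCRIPTION entries for you; a small sketch:

# add each detected package to Imports in DESCRIPTION
for (pkg in setdiff(packages, c("base", basename(base_path)))) {
  usethis::use_package(pkg, type = "Imports")
}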
Regarding the two bullets you wonder about at the bottom:
Updates to the DESCRIPTION file could be further automated with additional roclets; however, such a pull request was already deferred more than 4 years ago:
https://github.com/klutometis/roxygen/pull/76
I have to assume that the maintainers would indeed rather have you use the devtools package for updating the DESCRIPTION file, instead of adding this to roxygen2. So in that sense, devtools would be the first available choice.
I'm building a package that uses two main functions. One of the functions, in model.R, requires a special type of simulation defined in sim.R and a way to set up the results in a table, defined in table.R.
In a sharable package, how do I call both the sim.R and table.R files from within model.R? I've tried source("sim.R") and source("R/sim.R") but that call doesn't work from within the package. Any ideas?
Should I just copy and paste the codes from sim.R and table.R into the model.R script instead?
Edit:
I have all the scripts in the R directory, and the DESCRIPTION and NAMESPACE files are all set. I just have multiple scripts in the R directory: R/ has premodel.R, model.R, sim.R, and table.R. I need the model.R script to use functions from both sim.R and table.R, located in the same directory in the package (i.e. R/).
To elaborate on joran's point, when you build a package you don't need to source functions.
For example, imagine I want to make a package named TEST. I will begin by generating a directory (i.e. folder) named TEST. Within TEST I will create another folder named R; in that folder I will include all R scripts containing the different functions in the package.
At a minimum you also need to include a DESCRIPTION and NAMESPACE file. A man folder (for help files) and a tests folder (for unit tests) are also nice to include.
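For example, using the file names from the question, the TEST package's layout would look roughly like this:

TEST/
  DESCRIPTION
  NAMESPACE
  R/
    model.R
    premodel.R
    sim.R
    table.R
  man/
  tests/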
Making a package is pretty easy. Here is a blog with a straightforward introduction: http://hilaryparker.com/2014/04/29/writing-an-r-package-from-scratch/
As others have pointed out, you don't have to source R files in a package. The package loading mechanism will take care of loading the namespace and making all exported functions available. So usually you don't have to worry about any of this.
There are exceptions however. If you have multiple files with R code situations can arise where the order in which these files are processed matters. Often it doesn't matter or the default order used by R happens to be fine. If you find that there are some dependencies within your package that aren't resolved properly you may be faced with a situation where a custom processing order for the R files is required. The DESCRIPTION file offers the optional Collate field for this purpose. Simply list all your R files in the order they should be processed to satisfy the dependencies.
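As a sketch using the file names from the question, the relevant DESCRIPTION field could look like this (files are processed from top to bottom):

Collate:
    'sim.R'
    'table.R'
    'premodel.R'
    'model.R'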
If all your files are in the R directory, any function will be in memory after you do a package build or devtools::load_all().
You may have issues, though, if your files contain code that is not inside a function.
R loads files in alphabetical order.
Usually, this is not a problem, because function bodies are evaluated when the functions are called, not at loading time (i.e. a function can refer to another function not yet defined, even in the same file).
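For example, something like this is perfectly fine inside a package, even though helper() is defined after model() (these are illustrative names, not from the question):

# model.R
model <- function(x) {
  helper(x) + 1  # helper() only needs to exist when model() is actually called
}

# sim.R (or later in the same file)
helper <- function(x) {
  x * 2
}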
But if you have code outside a function in model.R, this code will be executed immediately when the file is loaded, and your package build will usually fail with a
ERROR: lazy loading failed for package 'yourPackageName'
If this is the case, wrap the stray code of model.R in a function so you can call it later, once the package (and any external libraries) has fully loaded.
If this piece of code is there to initialize some value, consider using use_data() to have R take care of loading the data into the environment for you.
If this piece of code is just interactive code written to test and implement the package itself, you should consider putting it elsewhere or wrapping it in a function anyway.
If you really need that code to be executed at loading time, or you really have dependencies to resolve, then you must add the Collate field to the DESCRIPTION file, as already stated by Peter Humburg, to force the order in which R loads the files.
Roxygen2 can help you here: put this before your code
#' @include sim.R table.R
call roxygenize(), and the Collate line will be generated for you in the DESCRIPTION file.
But even doing that, external libraries you may depend on are not yet loaded by the package, which can again lead to failure at build time.
In conclusion, you'd better not leave code outside of functions in an .R file that is located inside a package.
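To illustrate the advice above with hypothetical names, a stray top-level statement can be wrapped in a function so it runs only when explicitly called, after the package and its dependencies are fully loaded:

# Before: executed during lazy loading, so the build can fail
# defaults <- compute_defaults(sim_grid)

# After: executed only when the caller asks for it
make_defaults <- function() {
  compute_defaults(sim_grid)  # compute_defaults() and sim_grid are placeholders
}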
Since you're building a package, the reason you're having trouble accessing the other functions in your R/ directory is that you first need to run:
library(devtools)
document()
from within the working directory of your package. Now each function in your package should be accessible to any other function. Then, to finish up, do:
build()
install()
although it should be noted that a simple document() call will already be sufficient to solve your problem.
Make your functions global by defining them with <<- instead of <- and they will become available to any other script running in that environment.