function leading to check error in automatically generated RcppExports.R

I am using Rcpp 0.12.11 and R 3.4.0.
When I upgraded Rcpp to 0.12.11, the RcppExports.R file that Rcpp::compileAttributes generates automatically started to contain slightly different function calls:
run_graph_match <- function(A, B, algorithm_params) {
# Rcpp 0.12.10
.Call('RGraphM_run_graph_match', PACKAGE = 'RGraphM', A, B, algorithm_params)
# Rcpp 0.12.11
.Call(RGraphM_run_graph_match, A, B, algorithm_params)
}
Is there an easy way to explain the reason behind the change?
The latter function leads to errors when checking the R package. For example, errors such as
symbol 'RGraphM_run_graph_match' not in namespace:
.Call(RGraphM_run_graph_match, A, B, algorithm_params)

Congratulations, you've experienced the "Registering native routines" requirement (Section 5.4 of Writing R Extensions) added in R 3.4.0. The requirement mandated the inclusion of a src/init.c file that registered each C++ function and its parameters. Rcpp 0.12.11 therefore generates this registration code inside RcppExports.cpp. Meanwhile, the contents of RcppExports.R, which is what this question is about, depend on whether the user sets useDynLib(pkgname, .registration=TRUE) or useDynLib(pkgname) in the NAMESPACE file. The latter is not ideal, as it does not take advantage of a new option introduced in Rcpp 0.12.11, discussed next.
As a result of this shift in CRAN policy, JJ Allaire, the creator of Attributes for Rcpp [1], was inspired to advance a suggestion made by Douglas Bates back in 2012, when Attributes were first added. Specifically, the goal was to change the call from being string-based to being symbol-based. The rationale is simply that a symbol is on hand once the package loads, whereas a string has to be looked up and converted into a symbol each time the function is run. Symbol lookup is therefore less expensive on repeated calls than the string-based method Rcpp used in the past.
Basically, this line:
.Call('RGraphM_run_graph_match', PACKAGE = 'RGraphM', A, B, algorithm_params)
involved R looking up the symbol on every call of the enclosing R function in order to reach the C++ function.
Meanwhile, this line:
.Call(RGraphM_run_graph_match, A, B, algorithm_params)
is a direct call to the C++ function as the symbol is already in memory.
Those are the primary reasons why Rcpp changed how RcppExports.R is automatically generated. One downside of this approach is that functions can no longer be exported globally as before. In particular, some users whose NAMESPACE file contained a global export pattern, e.g.
exportPattern("^[[:alpha:]]+")
had to remove it and opt to manually specify what functions or variables should be exported.
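As a rough sketch, a NAMESPACE file after that change might look like the fragment below. The package and function names are taken from the question; the importFrom line is an assumption based on what Rcpp-based packages commonly contain. With .registration=TRUE, R creates a variable for each registered routine when the package loads, which is what the symbol-based .Call() relies on.

```
# NAMESPACE (sketch; assumes the package is RGraphM, as in the question)

# Load the shared library and create R variables for the registered routines,
# so that .Call(RGraphM_run_graph_match, ...) can resolve its first argument
useDynLib(RGraphM, .registration = TRUE)

# Assumption: commonly present in Rcpp-based packages
importFrom(Rcpp, evalCpp)

# Explicit exports instead of exportPattern("^[[:alpha:]]+")
export(run_graph_match)
```

After editing NAMESPACE, re-running Rcpp::compileAttributes() regenerates RcppExports.R so that it matches the useDynLib directive.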
For more details, you may wish to see the GitHub PR that introduced this feature:
https://github.com/RcppCore/Rcpp/pull/694
[1]: For more on Attributes, see my history post: http://thecoatlessprofessor.com/programming/rcpp/to-rcpp-attributes-and-beyond-from-inline/

Related

Should I write `.registration = TRUE` for an "external" DLL in a R package?

When I write a package that includes some C code or uses Rcpp, I add the roxygen tag:
#' @useDynLib TheDLL, .registration=TRUE
I made a package in which I included some DLLs created with Haskell, which I put in the inst/libs folder. I didn't add .registration=TRUE to the roxygen tag, and the package works fine. Should I add it nevertheless? If so, what is the role of .registration=TRUE?
I think you almost certainly shouldn't use it for a general-purpose DLL, but if the DLL was written specifically for R, maybe you should. It indicates that the DLL calls R_registerRoutines from its R_init_DLLNAME function, so entry points can be saved into variables. For example, you might have a function named "foo". You can call it using
.Call("foo", ...)
without registering it, and R will need to search symbol tables for it at run time. Or you can register it and call it as
.Call(foo, ...)
and the search is unnecessary. This is discussed mainly in section 5.4.2 of "Writing R Extensions". I believe that if you specify .registration=TRUE then R will use the registration information to find entry points; otherwise it needs to search through all the exports of the DLL, which is probably slower.
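To get a feel for what registration provides, here is a minimal sketch that inspects a DLL shipped with R. The stats package is used purely as a stand-in (an assumption for illustration; any loaded package with registered compiled code should behave similarly), and only base R functions are involved.

```r
# Make sure the stats namespace (and therefore its DLL) is loaded
invisible(loadNamespace("stats"))

# DLLInfo object describing the loaded shared library
dll <- getLoadedDLLs()[["stats"]]

# Routines the DLL registered through R_registerRoutines(), grouped by interface
routines <- getDLLRegisteredRoutines(dll)
head(names(routines$.Call))

# Whether R may still fall back to searching the DLL's exported symbols by name;
# unclass() is used because `$` on a DLLInfo object looks up native symbols instead
unclass(dll)$dynamicLookup
```

A DLL that registers nothing would show empty routine lists here, and calls into it would have to go through the string-based lookup.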

Source code of the function `grDevices:::C_col2rgb`?

How can I find the source C code of the function grDevices:::C_col2rgb?
I've been led to this function after benchmarking (using R pkg profvis) some RGL functions, namely rgl:::rgl.quads and functions called therein. The corresponding R function that wraps C_col2rgb is col2rgb from grDevices. I'm interested in looking at the source of C_col2rgb to see whether I could make a faster version.
And, in general, when you encounter a C function being used in R code, is there a quick way of finding its source code?
Many thanks!
Normally, when you want to view the source code of an R function, you can just type its name in the console and press enter. However, when that function is written in another language, such as C, and exposed to R, you will eventually just see something like
.Call(C_col2rgb, col, alpha)
where R calls the compiled code. To see the source code of such functions, you actually have to look at the package source code. The function you are talking about is in the grDevices package, which is part of what is often called "base R" (not (necessarily) to be confused with the R package base) -- the package ships with all R installations.
There is an R source code mirror on GitHub at https://github.com/wch/r-source that I like to consult if I need to look at R's source code. The code for the grDevices package is there at https://github.com/wch/r-source/tree/trunk/src/library/grDevices.
As I mentioned in the comments, you can find the code for C_col2rgb() at r-source/src/library/grDevices/src/colors.c. However, there it looks like it's just called col2rgb(). Is it really the same?
Yes. If you consult Writing R Extensions, Section 1.5.4, you see that
A NAMESPACE file can contain one or more useDynLib directives which allows shared objects that need to be loaded.... Using argument .fixes allows an automatic prefix to be added to the registered symbols, which can be useful when working with an existing package. For example, package KernSmooth has
    useDynLib(KernSmooth, .registration = TRUE, .fixes = "F_")
which makes the R variables corresponding to the Fortran symbols F_bkde and so on, and so avoid clashes with R code in the namespace.
We can see in the NAMESPACE file for grDevices
useDynLib(grDevices, .registration = TRUE, .fixes = "C_")
So, the C functions that this package makes available to R are all prefixed with C_ on the R side, even though they don't carry that prefix in the C source code. This lets the R function col2rgb and the C function col2rgb coexist without a name clash.
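The mapping between the prefixed R variable and the unprefixed C name can be checked from the console. The following is a small sketch using only base R; the variable name sym is just for illustration.

```r
# R-level variable created by useDynLib(grDevices, .registration = TRUE, .fixes = "C_")
sym <- grDevices:::C_col2rgb

# The object is a description of a registered native routine, not an R function
class(sym)

# Name under which the routine was registered, i.e. the name to search for in the C sources
sym$name

# The same name appears among the .Call routines registered by the grDevices DLL
"col2rgb" %in% names(getDLLRegisteredRoutines("grDevices")$.Call)
```

So when an R function shows .Call(C_something, ...), dropping the prefix declared in the package's NAMESPACE usually gives the name to look for in the package's src directory.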

R: How do I make my package use another package?

This is a very simple question.
I am extending someone's package. It currently uses packages A, B and they are listed in the DESCRIPTION file.
If I need functions from package C, do I just add the package to the DESCRIPTION file, and is that all that is needed? Into what section should it go: Depends or Imports? Are there any other steps to take? Do I need to use the prefix C::functionInC() whenever my code uses a function from package C?
Short answer:
Add C to Imports: and, when using the C functions, use the double colon prefix (C::functionInC()).
Longer context:
The link below provides the following advice
http://r-pkgs.had.co.nz/namespace.html#imports
R functions
If you are using just a few functions from another package, my recommendation is to note the package name in the Imports: field of the DESCRIPTION file and call the function(s) explicitly using ::, e.g., pkg::fun().
If you are using functions repeatedly, you can avoid :: by importing the function with @importFrom pkg fun. This also has a small performance benefit, because :: adds approximately 5 µs to function evaluation time.
Alternatively, if you are repeatedly using many functions from another package, you can import all of them using @import package. This is the least recommended solution because it makes your code harder to read (you can’t tell where a function is coming from), and if you @import many packages, it increases the chance of conflicting function names.
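To make the two recommended styles concrete, here is a small sketch. The stats and utils packages stand in for the hypothetical package C (they ship with R, so the snippet runs as-is), and median_head is a made-up function name. In a real package, both would also have to be listed under Imports: in DESCRIPTION.

```r
#' Median of the first n values
#'
#' @importFrom stats median
#' @export
median_head <- function(x, n = 3) {
  # Explicit prefix: needs the package in Imports:, but no import directive
  first <- utils::head(x, n)
  # Bare name: resolved through the @importFrom directive above
  median(first)
}

median_head(c(5, 1, 3, 100))
```

The roxygen comments only take effect once roxygen2 regenerates NAMESPACE; run as a plain script, they are ordinary comments.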

Changing the load order of files in an R package

I'm writing a package for R in which the exported functions are decorated by a higher-order function that adds error checking and some other boilerplate code.
However, because this code is at the top level, it is evaluated after parsing. This means that the load order of the package files is important.
To give an equivalent but simplified example, suppose I have a package with two files (Negate2.R and Utils.R), and I require Negate2.R to be loaded first so that the function isfalse() can be defined without throwing an error.
# /Negate2.R
Negate2 <- Negate
# -------------------
# /Utils.R
istrue <- isTRUE
isfalse <- Negate2(istrue)
Is it possible to structure NAMESPACE, DESCRIPTION (collate) or another package file in order to change the load order of files? The inner workings of the R package structure and CRAN are still black magic to me.
It is possible to get around this problem using awkward hacks, but I am looking for the least repetitive way of solving it. The wrapper function must be a higher-order function, since it also changes the function call semantics of its input function. The package is code heavy (~6000 lines, 100 functions) so repetition would be...problematic.
Solution
As @Manetheran points out, to change the load order you just change the order of the file names in the DESCRIPTION file.
# /DESCRIPTION
Collate:
    'Negate2.R'
    'Utils.R'
The Collate: field of the DESCRIPTION file allows you to change the order files are loaded when the package is built.
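Why the order matters can be reproduced outside a package. The sketch below treats each file from the question as a quoted expression and evaluates them one after another; load_in_order is a hypothetical helper written for this illustration, not part of R's package machinery.

```r
# Evaluate a sequence of "files" (quoted expressions) in a fresh environment,
# in the order given, roughly the way package code is evaluated file by file
load_in_order <- function(files) {
  env <- new.env()
  for (f in files) eval(f, envir = env)
  env
}

negate2_file <- quote(Negate2 <- Negate)
utils_file <- quote({
  istrue <- isTRUE
  isfalse <- Negate2(istrue)
})

# Order given by the Collate: field above: Negate2.R first, then Utils.R
ok <- load_in_order(list(negate2_file, utils_file))
ok$isfalse(FALSE)

# Reversed order: Utils.R calls Negate2() before it exists, so evaluation fails
failure <- tryCatch(
  load_in_order(list(utils_file, negate2_file)),
  error = function(e) conditionMessage(e)
)
failure
```

The first call succeeds and the second returns an error message, which is the same failure the package would hit if Utils.R were collated before Negate2.R.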
I stumbled across the answer to this question yesterday while reading up on Roxygen. If you've been documenting your functions with Roxygen, it can try to intelligently order your R source files in the Collate: field (based on where S4 class and method definitions are). This can be done by adding "collate" to the roclets argument of roxygenize. Alternatively if you're developing in RStudio there is a simple box that can be checked under Build->Configure Build Tools->Configure... (Button next to "Generate documentation with Roxygen").
R loads files in alphabetical order. To change the order, use the Collate field of the DESCRIPTION file.
roxygen2 provides an explicit way of saying that one file must be loaded before another: @include. The @include tag gives a space-separated list of file names that should be loaded before the current file:
#' @include class-a.r
setClass("B", contains = "A")
If any @include tags are present in the package, roxygen2 will set the Collate field in the DESCRIPTION.
You need to regenerate the roxygen2 documentation for the changes to take effect.

Resources