Ada supports the renaming of packages (see here).
Unfortunately I have to work on some legacy Ada code and the previous developer renamed packages excessively: more than 100 renames in a single package specification/body are not unusual. In my opinion it is hard to read the Ada code, especially because the renames are not consistent.
So I thought about removing such renames and replacing all usages of the renamed package identifier with the original package identifier. I could do it by hand, but before starting I would like to know whether this task can be automated.
After removing all package renames, I would like to check for each compilation unit whether the original code and the refactored code have the same semantics. Could this be done by comparing the compiler output?
I'm developing a package (golem) in R, and the check returns a NOTE about too many packages in Imports (DESCRIPTION):
checking package dependencies … NOTE
Imports includes 34 non-default packages.
Importing from so many packages makes the package vulnerable to any of
them becoming unavailable. Move as many as possible to Suggests and
use conditionally.
I have allocated some packages in Suggests (DESCRIPTION), like this:
usethis::use_package(package = "ggplot2", type = "Suggests")
usethis::use_package(package = "MASS", type = "Suggests")
I would like to know:
What is the difference between Imports (run-time) and Suggests (develop-time), and does the latter have anything to do with the term "compile time" in other programming languages?
How do I know whether a package is needed by the user at runtime? Is there a universal rule for this (like a phrase to help you decide)? And for Suggests?
In R, packages listed in the Imports clause of the DESCRIPTION file must be available or your package won't load. Normally they will all be loaded when your package is loaded, though it's possible to delay that by not importing anything, just using :: notation to access them.
Packages listed in the Suggests clause don't need to be available, and won't be automatically loaded. To access their functions, you normally call requireNamespace() to find out if the package is available, and if so use :: for access. If it is not available, your package should fail gracefully in whatever the user was trying to do, letting them know that they need to install the missing package if they want the task to succeed.
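A minimal sketch of that pattern, assuming MASS is listed under Suggests. The helper check_suggested() and the function robust_fit() are hypothetical names used only for illustration:

```r
# Hypothetical helper: stop with a clear message when a suggested
# package is not installed; otherwise return TRUE invisibly.
check_suggested <- function(pkg, task) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(
      sprintf("Package '%s' is needed to %s. Please install it.", pkg, task),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Hypothetical package function that depends on MASS (in Suggests).
robust_fit <- function(formula, data) {
  check_suggested("MASS", "fit a robust regression")
  MASS::rlm(formula, data = data)  # accessed with ::, never attached
}
```

Only the users who actually call robust_fit() need MASS installed; everything else in the package keeps working without it.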
These aren't really "run-time" versus "develop-time" differences. It's all run-time.
There are two things in R that might be called "compile-time" in other languages. The best match is installing your package. That configures it to the particular R version and platform it is running on. R also has a "just-in-time" compiler that optimizes functions, but other than a bit of a speed increase that is pretty much invisible to the user.
I think @r2evans answered your second question clearly in a comment: the user needs a package to use functions that use that package. If the functions of yours that use it are unlikely to be used by most users, put it in Suggests and add the requireNamespace() test.
I have already made a simple R package (pure R) to solve a problem with brute force, then I tried to speed up the code by writing an Rcpp script. I wrote a script to compare the running times with the "bench" library. Now, how can I add this script to my package? I tried to add
#' @importFrom Rcpp cppFunction
on top of my R script and to put the Rcpp file in the scr folder, but it didn't work. Is there a way to add it to my R package without creating the package from scratch? Sorry if this has already been asked, but I am new to all this and completely lost.
That conversion is actually (still) surprisingly difficult (in the sense of requiring more than just one file). It is easy to overlook details. Let me walk you through why.
Let us assume for a second that you started a working package using the R function package.skeleton(). That is the simplest and most general case. The package will work (yet have warnings; see my pkgKitten package for a wrapper that cleans up, and a dozen other package-helping functions and packages on CRAN). Note in particular that I have said nothing about roxygen2, which at this point is just an added complication, so let's focus on just .Rd files.
You can now contrast your simplest package with one built by and for Rcpp, namely by using Rcpp.package.skeleton(). You will see at least these differences in
DESCRIPTION for LinkingTo: and Imports
NAMESPACE for importFrom as well as the useDynLib line
a new src directory and a possible need for src/Makevars
All of which make it easier to (basically) start a new package via Rcpp.package.skeleton() and copy your existing package code into that package. We simply do not have a conversion helper. I still do the "manual conversion" you tried every now and then, and even I need a try or two and I have seen all the error messages a few times over...
So even if you don't want to "copy everything over" I think the simplest way is to
create two packages with and without Rcpp
do a recursive diff
ensure the difference is applied in your original package.
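The first two steps can be sketched in R itself, as a rough substitute for a recursive diff. This assumes Rcpp is installed; compare_skeletons(), the package name "mypkg" and the toy function add_one() are hypothetical names for illustration:

```r
# Build a plain and an Rcpp skeleton of the same (hypothetical) package
# in separate temporary directories and report what the Rcpp version adds.
compare_skeletons <- function(pkg = "mypkg") {
  stopifnot(requireNamespace("Rcpp", quietly = TRUE))

  # One toy function so package.skeleton() has something to package.
  obj_env <- new.env()
  assign("add_one", function(x) x + 1, envir = obj_env)

  root <- tempfile("skeletons-")
  plain_dir <- file.path(root, "plain")
  rcpp_dir <- file.path(root, "rcpp")
  dir.create(plain_dir, recursive = TRUE)
  dir.create(rcpp_dir, recursive = TRUE)

  utils::package.skeleton(name = pkg, list = "add_one",
                          environment = obj_env, path = plain_dir)
  Rcpp::Rcpp.package.skeleton(name = pkg, list = "add_one",
                              environment = obj_env, path = rcpp_dir)

  plain_pkg <- file.path(plain_dir, pkg)
  rcpp_pkg <- file.path(rcpp_dir, pkg)

  # Lines present only in the Rcpp version of a given file.
  added_lines <- function(file) {
    setdiff(readLines(file.path(rcpp_pkg, file)),
            readLines(file.path(plain_pkg, file)))
  }

  list(
    new_files   = setdiff(list.files(rcpp_pkg, recursive = TRUE),
                          list.files(plain_pkg, recursive = TRUE)),
    description = added_lines("DESCRIPTION"),
    namespace   = added_lines("NAMESPACE")
  )
}
```

Calling str(compare_skeletons()) lists the extra files and the DESCRIPTION and NAMESPACE lines that still have to be carried over into the original package by hand.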
PS And remember, when you use roxygen2 and have documentation in the src/ directory, to always run Rcpp::compileAttributes() first, before running roxygen2::roxygenize(). RStudio and other helpers do that for you but it is still easy to forget...
I am currently developing an R package and want it to be as clean as possible, so I try to resolve all WARNINGs and NOTEs displayed by devtools::check().
One of these notes is related to some code I use for generating sample data to go with the package:
checking top-level files ... NOTE
Non-standard file/directory found at top level:
'generate_sample_data.R'
It's an R script currently placed in the package root directory and not meant to be distributed with the package (because it doesn't really seem useful to include)
So here's my question:
Where should I put such a file or how do I tell R to leave it be?
Is .Rbuildignore the right way to go?
Currently devtools::build() puts the R script in the final package, so I shouldn't just ignore the NOTE.
As suggested in http://r-pkgs.had.co.nz/data.html, it makes sense to use ./data-raw/ for scripts/functions that are necessary for creating/updating data but not something you need in the package itself. After adding ./data-raw/ to ./.Rbuildignore, the package generation should ignore anything within that directory. (And, as you commented, there is a helper-function devtools::use_data_raw().)
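A minimal base R sketch of what that amounts to, run in a throwaway directory that stands in for the package root. The helper add_build_ignore() is a hypothetical name; inside a real package, usethis::use_build_ignore("data-raw") should have the same effect:

```r
# Hypothetical helper: add a pattern to .Rbuildignore if it is not there yet.
add_build_ignore <- function(pkg_dir, pattern) {
  ignore_file <- file.path(pkg_dir, ".Rbuildignore")
  current <- if (file.exists(ignore_file)) readLines(ignore_file) else character()
  if (!pattern %in% current) {
    writeLines(c(current, pattern), ignore_file)
  }
  invisible(readLines(ignore_file))
}

# Throwaway directory standing in for the package root.
pkg_dir <- tempfile("pkg-")
dir.create(file.path(pkg_dir, "data-raw"), recursive = TRUE)
writeLines(
  "# code that builds the sample data",
  file.path(pkg_dir, "data-raw", "generate_sample_data.R")
)

# Each line of .Rbuildignore is a regular expression matched against
# paths relative to the package root, so anchor it to this directory.
add_build_ignore(pkg_dir, "^data-raw$")
```

The script then stays in the source tree (and in version control) but is left out of the built package.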
I get this warning
Non-standard file/directory found at top level:
‘data-raw’
when building my package, even though creating this folder is the recommended way to create package data: http://r-pkgs.had.co.nz/data.html#data-sysdata
Any comments on that, or do I need a specific setting to get rid of this message?
When used, data-raw should be added to .Rbuildignore. As explained in the Data section of Hadley's R Packages book (also linked in the question):
Often, the data you include in data/ is a cleaned up version of raw data you’ve gathered from elsewhere. I highly recommend taking the time to include the code used to do this in the source version of your package. This will make it easy for you to update or reproduce your version of the data. I suggest that you put this code in data-raw/. You don’t need it in the bundled version of your package, so also add it to .Rbuildignore. Do all this in one step with:
usethis::use_data_raw()
I am quite new to R, but it seems this question is closely related to the following posts 1, 2, 3 and a slightly different topic 4. Unfortunately, I do not have enough reputation to comment right there. My problem is that after going through all the suggestions there, the code still does not work:
I included "Depends" in the description file
I tried the second method including a change of NAMESPACE (not reproducible)
I created an example package here containing a very small part of the code, which showed a slightly different error ("J" not found in routes[J(lat1, lng1, lat2, lng2), .I, roll = "nearest", by = .EACHI] instead of 'lat1' not found in routes[order(lat1, lng1, lat2, lng2, time)])
I tested all scripts using the console and R-scripts. There, the code ran without problems.
Thank you very much for your support!
Edit: @Roland
You are right. Roxygen overwrites the NAMESPACE. You have to add #' @import data.table to the function. Do you understand why only inserting Depends: data.table in the DESCRIPTION file does not work? This might be a useful hint in the documentation, or did I miss it?
It was misleading that changing to routes <- routes[order("lat1", "lng1", "lat2", "lng2", "time")] helped at least a bit, as this line was suddenly no problem any more. Is it correct that in this case the data.frame order is used? I will see how far I get now. I will let you know the final result...
Answering your questions (after edit).
Quoting R exts manual:
Almost always packages mentioned in ‘Depends’ should also be imported from in the NAMESPACE file: this ensures that any needed parts of those packages are available when some other package imports the current package.
So you should still have the import in NAMESPACE, regardless of whether you list data.table under Depends or Imports.
The order call doesn't seem to do what you expect; try the following:
order("lat1", "lng1", "lat2", "lng2", "time")
library(data.table)
data.table(a=2:1,b=1:2)[order("a","b")]
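To see why the quoted version does not sort anything, here is a small base R sketch. It uses a plain data.frame with made-up values instead of data.table so that it runs without extra packages:

```r
routes <- data.frame(lat1 = c(3, 1, 2), time = c(10, 30, 20))

# Quoted names are just length-one character vectors, so there is
# nothing to sort: order() returns the single index 1.
idx_quoted <- order("lat1", "time")

# Passing the columns themselves yields a real row permutation.
idx_columns <- order(routes$lat1, routes$time)

routes[idx_quoted, ]   # only the first row survives
routes[idx_columns, ]  # all rows, sorted by lat1 then time
```

So the quoted call silently changes what the line does, which is why the error disappeared without the data actually being sorted.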
In case of issues I recommend starting to debug by writing unit tests for your expected results. The most basic way to put unit tests in a package is a plain R script in the tests directory containing stopifnot(...) calls. Be aware that you need to library/require your package at the start of the script.
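Such a test script could look roughly like this. The file name tests/basic.R, the package name mypkg and the function sort_routes() are hypothetical; a stand-in definition is included here only so the sketch runs on its own:

```r
# tests/basic.R (hypothetical file name)
# In a real package the script would start by attaching the package,
# e.g. library(mypkg), instead of defining the stand-in below.
sort_routes <- function(routes) {
  routes[order(routes$lat1, routes$time), , drop = FALSE]
}

routes <- data.frame(lat1 = c(2, 1, 1), time = c(1, 2, 1))
sorted <- sort_routes(routes)

# R CMD check runs every script in tests/ and fails if one stops.
stopifnot(
  identical(sorted$lat1, c(1, 1, 2)),
  identical(sorted$time, c(1, 2, 1))
)
```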
This is more in addition to the answers above: I found this to be really useful...
From the docs [Hadley-description](http://r-pkgs.had.co.nz/description.html)
Imports packages listed here must be present for your package to
work. In fact, any time your package is installed, those packages
will, if not already present, be installed on your computer
(devtools::load_all() also checks that the packages are installed).
Adding a package dependency here ensures that it’ll be installed.
However, it does not mean that it will be attached along with your
package (i.e., library(x)). The best practice is to explicitly refer
to external functions using the syntax package::function(). This
makes it very easy to identify which functions live outside of your
package. This is especially useful when you read your code in the
future.
If you use a lot of functions from other packages this is rather
verbose. There’s also a minor performance penalty associated with
:: (on the order of 5µs, so it will only matter if you call the
function millions of times).
From the docs Hadley-namespace
NAMESPACE also controls which external functions can be used by your
package without having to use ::. It’s confusing that both
DESCRIPTION (through the Imports field) and NAMESPACE (through import
directives) seem to be involved in imports. This is just an
unfortunate choice of names. The Imports field really has nothing to
do with functions imported into the namespace: it just makes sure the
package is installed when your package is. It doesn’t make functions
available. You need to import functions in exactly the same way
regardless of whether or not the package is attached.
... this is what I recommend: list the package in DESCRIPTION so that it’s
installed, then always refer to it explicitly with pkg::fun().
Unless there is a strong reason not to, it’s better to be explicit.
It’s a little more work to write, but a lot easier to read when you
come back to the code in the future. The converse is not true. Every
package mentioned in NAMESPACE must also be present in the Imports or
Depends fields.
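As a small illustration of that recommendation, here is a sketch of a function that refers to everything explicitly. summarise_values() is a hypothetical name, and the stats package stands in for whatever package you would list under Imports in DESCRIPTION:

```r
# Every external function is qualified with ::, so it is obvious
# where it comes from and no import directive in NAMESPACE is needed.
summarise_values <- function(x) {
  c(median = stats::median(x), sd = stats::sd(x))
}

result <- summarise_values(c(1, 2, 3, 4))
```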