R - setting up my own CRAN repository

I want to set up a local CRAN-style repository containing just one package (let's call it MyPackage). The reason is that I want to share this package with people at my company. By the way, we all use Ubuntu Linux.
I have already done this:
I have a web server (Boa) and created a web folder called R on it, then created the folders src and contrib.
In the contrib folder I put my package MyPackage (as a tar.gz) plus the PACKAGES file.
However, when I do this:
install.packages("MyPackage", repos = "127.0.0.1/R" )
it does not work:
Warning: unable to access index for repository [ ]
Warning message:
In getDependencies(pkgs, dependencies, available, lib) :
package ‘MyPackage’ is not available (for R version 2.13.1)
Can you guide me a bit and tell me what the correct folder structure is?
Thanks.

See "Section 6.6 Setting up a package repository" of the R Admin manual.
Edit some three+ years later: We now have the drat package which automates creating a repository, and can use GitHub in a clever way to host it for you.

You might just need to specify the URL properly, including the scheme: http://127.0.0.1/R.
Also, make sure you can access that URL in your browser.
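To see what R will actually request, it can help to print the index location it derives from the repos value. A minimal sketch using the address from the question (nothing is downloaded here; contrib.url() only builds a string):

```r
# Repository root as it should be passed to install.packages(), scheme included.
repo <- "http://127.0.0.1/R"

# contrib.url() appends the sub-path R uses for source packages.
index_dir <- utils::contrib.url(repo, type = "source")
print(index_dir)
# [1] "http://127.0.0.1/R/src/contrib"

# With the web server running, the install call would then be:
# install.packages("MyPackage", repos = repo, type = "source")
```

R looks for the PACKAGES index under that path, so http://127.0.0.1/R/src/contrib/PACKAGES should open in your browser too.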

The miniCRAN package works well for me. There are several advantages to using miniCRAN to create the repository:
Security: Many R users are accustomed to downloading and installing new R packages at will from CRAN or one of its mirror sites; a local repository lets you control which packages are on offer.
Easier offline installation: Installing a package on an offline server requires that you also download all of its dependencies. Using miniCRAN makes it easier to get all dependencies in the correct format.
Improved version management: In a multiuser environment, there are good reasons to avoid unrestricted installation of multiple package versions on the server.
Use other R package indexes: You may wish to make packages available from public repositories other than CRAN, e.g. BioConductor, r-forge, OmegaHat, etc.
Prepare your own R repo: You may wish to add custom in-house packages to your repository.
See intro:
Using miniCRAN to create a local CRAN repository
Create a local package repository using miniCRAN

I think the problem is revealed in this statement:
"In the contrib folder I put ... the PACKAGES file."
The PACKAGES file is an index for the repository. You need to create that file after your package files are placed in the repository directory. Don't copy and paste the PACKAGES file from another repository.
If I were you, here's what I would do. First, add the following code to your .Rprofile for a local repository:
utils::setRepositories(ind = 0, addURLs = c(WORK = "http://127.0.0.1/R"))
Restart R after changing your .Rprofile.
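ind = 0 indicates that you only want the local repository. Additional repositories can be included in the addURLs = option as further elements of the character vector.
Then, create the repository index. write_PACKAGES works on a filesystem directory rather than a URL, so run it on the server against the directory that is served as /R/src/contrib (the path below is a placeholder for wherever your web root lives):
tools::write_PACKAGES("/path/to/webroot/R/src/contrib", verbose = TRUE)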
After you do that, you should be able to generate a matrix that lists all the packages. For example, my_packages <- available.packages().
If you see your package in that listing, then install it using the following code:
install.packages("MyPackage")
For more information, please see here.
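The whole sequence can be tried without a web server by pointing R at a file:// repository. This is only a sketch under assumptions: the build folder, the version 0.1.0 and the DESCRIPTION contents are placeholders, and the "package" is a bare DESCRIPTION file, which is enough for indexing but not a real package.

```r
# Repository skeleton: <root>/src/contrib is where source packages live.
repo <- file.path(tempdir(), "R")
contrib <- file.path(repo, "src", "contrib")
dir.create(contrib, recursive = TRUE)

# Placeholder package: a directory holding only a DESCRIPTION file.
build_dir <- file.path(tempdir(), "build")
dir.create(file.path(build_dir, "MyPackage"), recursive = TRUE)
writeLines(
  c("Package: MyPackage",
    "Version: 0.1.0",
    "Title: Placeholder Package",
    "Description: Exists only to demonstrate repository indexing.",
    "License: GPL-3"),
  file.path(build_dir, "MyPackage", "DESCRIPTION")
)

# Pack it as MyPackage_0.1.0.tar.gz with MyPackage/ as the top-level folder.
old_wd <- setwd(build_dir)
utils::tar(file.path(contrib, "MyPackage_0.1.0.tar.gz"),
           files = "MyPackage", compression = "gzip", tar = "internal")
setwd(old_wd)

# Build the index from the tarballs that are actually in the directory.
n_indexed <- tools::write_PACKAGES(contrib, type = "source", verbose = TRUE)

# Query the repository through a file:// URL (Unix-style path assumed).
repo_url <- paste0("file://", normalizePath(repo, winslash = "/"))
pkgs <- available.packages(repos = repo_url, type = "source")
print(rownames(pkgs))
```

On the real server the same write_PACKAGES call is run against the served src/contrib directory, with a tarball produced by R CMD build instead of the placeholder.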

Related

Is there a way to 'install' R packages without running install.packages()?

We are testing how to run R in the cloud in a secure, isolated environment that is blocked from CRAN and also cannot use install.packages(). We defined an environment based on Anaconda's R Essentials bundle, but we would still like to be able to customize it on demand with extra packages. Is there a way to simulate install.packages(), e.g. by downloading the package offline, zipping it, copying it to the secure environment and unzipping it to a specific location in the library folder?
thanks!
You can download the package from CRAN as a zip and then transport it to the isolated PC as a file. For example, here is the link to dplyr on CRAN: https://cran.r-project.org/web/packages/dplyr/index.html
Then use the code below to install the local file:
install.packages("~/Downloads/dplyr_1.0.7.zip", repos = NULL)
On Windows you might require Rtools. At least there was a Warning about it but the package still installed.
For Linux machines, you can build the package from source using the tarball from the same page:
install.packages("~/Downloads/dplyr_1.0.7.tar.gz", repos = NULL, type = "source")
In both cases you need to take care of dependencies yourself, as they are not resolved when installing this way (look at the "Imports" field, along with "Depends" and "LinkingTo", on the package's CRAN page).
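On a connected machine, the list of dependencies to download can be computed instead of read off the web page. A minimal sketch with a made-up three-package index (pkgA, pkgB and pkgC are hypothetical names); in practice db would be the result of available.packages():

```r
# Toy stand-in for available.packages():
# pkgA imports pkgB, and pkgB depends on pkgC.
db <- matrix(
  c("pkgA", NA,     "pkgB", NA,
    "pkgB", "pkgC", NA,     NA,
    "pkgC", NA,     NA,     NA),
  ncol = 4, byrow = TRUE,
  dimnames = list(c("pkgA", "pkgB", "pkgC"),
                  c("Package", "Depends", "Imports", "LinkingTo"))
)

# recursive = TRUE follows dependencies of dependencies as well.
deps <- tools::package_dependencies(
  "pkgA", db = db,
  which = c("Depends", "Imports", "LinkingTo"),
  recursive = TRUE
)

# Everything that has to be copied to the isolated machine.
to_download <- sort(c("pkgA", deps[["pkgA"]]))
print(to_download)
```

The resulting names can then be fetched with download.packages() on the connected machine and installed from the local files in dependency order.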

Non-standard Remotes package INLA in R package

I have a package that requires INLA, which is not hosted on CRAN or a standard GitHub repository. There are multiple SO questions detailing how to install the package on a personal machine, such as this one, and some that even mention it as a dependency of a package.
The two ways that are typically recommended to install on a personal machine are:
Direct from INLA website
install.packages("INLA",repos=c(getOption("repos"),INLA="https://inla.r-inla-download.org/R/stable"), dep=TRUE)
From the GitHub host
devtools::install_github(repo = "https://github.com/hrue/r-inla", ref = "stable", subdir = "rinla", build = FALSE)
Now, these are fine for personal machines, but they don't work in the DESCRIPTION file's Remotes: section.
If we do url::https://inla.r-inla-download.org/R/stable, this gives an error that the file extension isn't recognized.
Error: Error: Failed to install 'unknown package' from URL:
Don't know how to decompress files with extension
If we do github::hrue/r-inla, I am unaware of how to pass (or if it's even possible) the ref, subdir, and build arguments in the DESCRIPTION file.
Previous packages used a read only mirror of the INLA code that was hosted on GitHub, solely for this purpose, at this repo and then just using github::inbo/INLA. However, this repository is out of date.
Current solution
What I'm doing instead is to directly reference the tarball hosted on the main webpage.
url::https://inla.r-inla-download.org/R/stable/src/contrib/INLA_21.02.23.tar.gz
This solution works and passes CI, and the machines are able to install and load the package from there. The only issue is that I need to periodically update the static link to this tarball; I would prefer to reference the stable build, either directly from the INLA website as above or from the hrue/r-inla repo with those other arguments passed. Referencing those locations directly would also have the advantage that, when my package is re-installed on a machine, it would recognize whether or not the latest version of INLA is installed there. Is there a way to achieve this in the DESCRIPTION file?
This is not a perfect answer, but what you could do is add the zip URL of the stable branch from INLA's new GitHub repository:
url::https://github.com/hrue/r-inla/archive/refs/heads/stable.zip
This will always install the latest stable version of the package.

Make CRAN R package suggest GitHub R package

I want to use the R package BOLTSSIRR available on GitHub in my R package, which I want to upload to CRAN.
I listed BOLTSSIRR under Suggests: in the DESCRIPTION file and made the link to GitHub available using Additional_repositories: https://github.com/daviddaigithub/BOLTSSIRR.
However, running R CMD check --as-cran I get:
Suggests or Enhances not in mainstream repositories:
BOLTSSIRR
Availability using Additional_repositories specification:
BOLTSSIRR no ?
? ? https://github.com/daviddaigithub/BOLTSSIRR
Additional repositories with no packages:
https://github.com/daviddaigithub/BOLTSSIRR
So the GitHub link does not seem to get recognized in the check. Might I have to change something here?
As you found, you can't use Remotes in a CRAN package. What you need to do is to make sure the .tar.gz file for the package you are depending on is available somewhere. Github doesn't do that automatically, because https://github.com/daviddaigithub/BOLTSSIRR isn't set up as a package repository.
The solution is to create your own small repository, and keep copies of non-CRAN packages there. The drat package (available here: https://github.com/eddelbuettel/drat) makes this easy as long as you have a Github account: follow the instructions here: https://github.com/drat-base/drat. In summary:
Fork https://github.com/drat-base/drat into your account, and clone it to your own computer.
Enable Github Pages with the docs/ folder in the main branch.
Install the drat package into R using remotes::install_github("eddelbuettel/drat"). (I assume this version will make it to CRAN eventually; if you use the current CRAN version instructions are slightly more complicated.)
Build the package you want to insert. You need the source version; you might want binaries too, if those are hard for your users to build.
Run options(dratBranch="docs"); drat::insertPackage(...) to insert those files into your repository.
Commit the changes, and push them to Github.
In the package that needs to use this non-CRAN package, add
Additional_repositories: https://yourname.github.io/drat
to the DESCRIPTION.
You will be responsible for updating your repository if BOLTSSIRR is updated. This is good because the updates might break your package: after all, BOLTSSIRR is still in development mode. It's also bad because your users won't automatically get bug fixes.
That's it, if I haven't missed anything!

How do I keep source files when using R's devtools library function 'install'

I am trying to build an R package (DESeq2) from source so that I can debug it. I've installed all the required dependencies and I'm following Hilary Parker's instructions for creating R packages. I'm running this on CentOS 6.6 using R-3.4.2.
I run :
library("devtools")
install("DESeq2", keep_source=TRUE)
It installs it in the directory with all my other R libraries. When I look at the installed DESeq2 library it is missing all the DESeq2/R/*.R and DESeq2/src/*.cpp files.
QUESTION : Where are these files and why didn't they get installed? This does not seem like the expected behavior.
R uses a binary database format for installed packages: for efficiency reasons (lazy loading), the objects are packed into database-like files. These database files (*.rdb and *.rdx) are stored in the R subfolder of the package installation path (see ?lazyLoad).
Even if you are looking in the right place for the installed package (use .libPaths() in R to find the installation folder) and you have installed the package with its source code (as you did, or via install.packages("a_CRAN_package", INSTALL_opts = "--with-keep.source")), you will not find .R files in the R folder there.
You can verify that the source code is available by picking one function name from the package and print it on the console. If you can see the source code (with comments) the package sources (R files) are available:
print(DESeq2::any_function)
To make the source code available for debugging and stack traces you can set the option keep.source.pkgs = TRUE (see ?options) in your .Rprofile file or via an environment variable:
keep.source.pkgs:
As for keep.source, used only when packages are
installed. Defaults to FALSE unless the environment variable
R_KEEP_PKG_SOURCE is set to yes.
Note: The source code is available then only for newly installed and updated packages (not for already installed packages!).
For more details see: https://yetanothermathprogrammingconsultant.blogspot.de/2016/02/r-lazy-load-db-files.html
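The effect of keeping sources can be checked programmatically: a function that retains its source carries a srcref attribute. A minimal sketch, where has_source is a hypothetical helper name and the toy function stands in for a package function:

```r
# Hypothetical helper: TRUE when a function still carries its source reference.
has_source <- function(f) !is.null(attr(f, "srcref"))

code <- "function(x) {\n  # add one\n  x + 1\n}"

# The same definition parsed with and without source retention.
with_src <- eval(parse(text = code, keep.source = TRUE))
without_src <- eval(parse(text = code, keep.source = FALSE))

print(has_source(with_src))     # TRUE
print(has_source(without_src))  # FALSE

# Printing shows the comment only when the source was kept.
print(with_src)
```

Applied to a function from an installed package, the same helper tells you whether that package was installed with its sources kept.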

Include non-CRAN package in CRAN package

The question is pretty simple. First:
Is it possible to include a non-CRAN (or bioconductor, or omega hat) package in a CRAN package and actually use tools from that package in examples.
If yes how does one set up the DESCRIPTION file etc. to make it legit and pass CRAN checks?
Specifically, I'm asking about openNLPmodels.en, which used to be a CRAN package. It's pretty useful and I want to include functionality from it. I could do a workaround and not actually use openNLPmodels.en in the examples or unit tests, and have it installed when a function gets used (similar to how the gender package installs the data sets it needs), but I'd prefer an approach that allows me to run checks, tests, and examples.
This is how one downloads and installs openNLPmodels.en
install.packages(
"http://datacube.wu.ac.at/src/contrib/openNLPmodels.en_1.5-1.tar.gz",
repos=NULL,
type="source"
)
The existing answer is good but doesn't explain the whole process in detail, so I am posting this one.
Is it possible to include a non-CRAN (or bioconductor, or omega hat) package in a CRAN package and actually use tools from that package in examples.
Yes, it is possible. Any use (package code, examples, tests, vignettes) of such a non-CRAN package has to be guarded, as for any other package in Suggests, ideally using
if (requireNamespace("non.cran.pkg", quietly=TRUE)) {
  non.cran.pkg::fun()
} else {
  cat("skipping functionality due to missing Suggested dependency")
}
If yes how does one set up the DESCRIPTION file etc. to make it legit and pass CRAN checks?
You need to use the Additional_repositories field in the DESCRIPTION file. The location given in that field has to contain the expected directory structure, with a PACKAGES file in the appropriate directory, and that PACKAGES file has to list the non-CRAN package.
Now going to your particular example of openNLPmodels.en package.
Given the way you download and install this package, it will not be possible to use it as a dependency and pass CRAN checks. openNLPmodels.en has to be published in the structure expected of an R repository; otherwise you don't have a valid location to put into the Additional_repositories field.
What you can do is to download non-CRAN package and publish it in your R repository yourself, and then use that location in Additional_repositories field in your CRAN package.
Here is an example of how to do it:
dir.create("src/contrib", recursive=TRUE)
download.file("http://datacube.wu.ac.at/src/contrib/openNLPmodels.en_1.5-1.tar.gz", "src/contrib/openNLPmodels.en_1.5-1.tar.gz")
tools::write_PACKAGES("src/contrib")
We just put package sources in expected directory src/contrib and the rest is nicely handled by write_PACKAGES function. To ensure that repository is properly created you can list packages that are available in that repository:
available.packages(repos=file.path("file:/",getwd()))
It should list your non-CRAN package there.
Then, with the non-CRAN package published in an R repository, you should put the location of that repository into the Additional_repositories field of your CRAN package. In this case the location is the one returned by the file.path("file:/",getwd()) expression.
Note that this is a location on your local machine; you will probably want to put it online, so that the URL can be accessed by any machine checking your CRAN package, including the checks on CRAN itself. For that, just move your src directory to a public directory that is hosted somewhere online and use the location of that server.
Now looking at your non-CRAN package again, we can see it has src/contrib in its url, thus we can assume that proper R repository already exists for it and we don't have to create and publish new one.
Therefore your installation instruction could look like
install.packages(
"openNLPmodels.en",
repos="http://datacube.wu.ac.at",
type="source"
)
And then all you need for your CRAN package is to use existing repository where it is available
Additional_repositories: http://datacube.wu.ac.at
It's possible, but! ...
There is a field in the DESCRIPTION file that you can use:
Additional_repositories: http://ghrr.github.io/drat
BUT!
Everything that depends on the functionality from the package from the additional repository has to be absolutely optional.
So packages from this repo should be placed under Suggests.
Example
I am not 100% sure whether or not BioConductor and OmegaHat are considered mainstream or not.
The usethis::use_dev_package function has solved this problem.
As an example, running this line:
usethis::use_dev_package(package = "h3", type = "Imports", remote = "crazycapivara/h3-r")
will automatically write the following lines to your DESCRIPTION file:
Imports:
    h3 (>= 3.7.1)
Remotes:
    crazycapivara/h3-r
Note that because github is the most commonly-used unofficial package distribution in R, it is the default. As such, make sure there is no github:: prefix to the entry in the Remotes section of the DESCRIPTION file.
