How to create a CRAN-ready R package that has an external dependency on libxml2

I have created an R package that I would like to submit to CRAN. It contains code that needs to be compiled in plain C and this code depends on the libxml2 library.
My current solution is to let Linux and Mac users install the libxml2-dev package, which lets them compile and install the R source package.
For Windows, I have created a special binary R-package that contains the required binary dependency. When reading the CRAN guidelines I see that only source packages may be uploaded and that they may not contain any binary files.
Given those guidelines, my questions are:
Is it OK for Mac/Linux to have the user install libxml2-dev before installing the R package, or are there alternative solutions?
What should I do for Windows, where libxml2 is not straightforward for an end user to install?

As mentioned above, you can just copy what the xml2 package does:
To get things to work on Linux/macOS, copy the files configure and src/Makevars.in. Note that macOS includes a copy of libxml2 by default, so you can safely link to -lxml2 as you would on Linux.
For Windows, you need to copy the files src/Makevars.win and tools/winlibs.R from xml2. The latter is a simple script that automatically downloads and statically links libxml2 from rwinlib when building the R package on Windows.
These build scripts are tested to work on (almost) any platform.
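To give a rough idea of what such a configure script does, here is a simplified sketch. It is not the actual xml2 script: the function names detect_xml2_flags and fill_makevars, the fallback include path, and the @cflags@ / @libs@ placeholder names are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch of the detection step a package's configure script might perform.
# Prefers pkg-config; falls back to conventional defaults if it is unavailable.
detect_xml2_flags() {
  if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists libxml-2.0; then
    PKG_CFLAGS=$(pkg-config --cflags libxml-2.0)
    PKG_LIBS=$(pkg-config --libs libxml-2.0)
  else
    # Assumed defaults; real systems may install the headers elsewhere.
    PKG_CFLAGS="-I/usr/include/libxml2"
    PKG_LIBS="-lxml2"
  fi
}

# Substitute the detected flags into a Makevars template read from stdin.
# The @cflags@ and @libs@ placeholders are illustrative names.
fill_makevars() {
  sed -e "s|@cflags@|$PKG_CFLAGS|" -e "s|@libs@|$PKG_LIBS|"
}

detect_xml2_flags
# A real script would read src/Makevars.in and write src/Makevars instead.
printf 'PKG_CPPFLAGS=@cflags@\nPKG_LIBS=@libs@\n' | fill_makevars
```

In a package, the last line would typically become something like fill_makevars < src/Makevars.in > src/Makevars, so that R picks up the generated file when compiling the C code.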

Related

R package build failing on Unix machines due to missing GSL - GNU Scientific Library

I am facing a particularly vexing problem with R package development. My own package, called ggstatsplot (https://github.com/IndrajeetPatil/ggstatsplot), depends on userfriendlyscience, which depends on another package called MBESS, which itself ultimately depends on another package called gsl. There is no problem at all for installation of ggstatsplot on a Windows machine (as assessed by AppVeyor continuous integration platform: https://ci.appveyor.com/project/IndrajeetPatil/ggstatsplot).
But whenever the package is to be installed on Unix machines, it throws the error that ggstatsplot can't be downloaded because userfriendlyscience and MBESS can't be downloaded because gsl can't be downloaded. The same thing is also revealed on Travis continuous integration platform with virtual Unix machines, where the package build fails (https://travis-ci.org/IndrajeetPatil/ggstatsplot).
Now one way to solve this problem for the user on the Unix machine is to configure GSL (as described here: installing R gsl package on Mac), but I can't possibly expect every user of ggstatsplot to go through the arduous process of configuring GSL. I want them to just run install.packages("ggstatsplot") and be done with it.
So I would really appreciate if anyone can offer me any helpful advice as to how I can make my package user's life simpler by removing this problem at its source. Is there something I should include in the package itself that will take care of this on behalf of the user?
This may not have a satisfying solution via changes to your R package (I'm not sure either way). If the gsl package authors (which include a former R Core member) didn't configure it to avoid a pre-req installation of a linux package, there's probably a good reason not to.
But it may be some consolation that most R+Linux users understand that some R packages first require installing the underlying Linux libraries (eg, through apt or dnf/yum).
Primary Issue: making it easy for the users to install
Try to be super clear on the GitHub readme and the CRAN INSTALL file. The gsl package has decent CRAN directions. This leads to the following bash code:
sudo apt-get install libgsl0-dev
The best example of clear (linux pre-req package) documentation I've seen is from the curl and sf packages. sf's CRAN page lists only the human names of the 3 libraries, but the GitHub page provides the exact bash commands for three major distribution branches. The curl package does this very well too (eg, CRAN and GitHub). For example, it provides the following explanation and bash code:
Installation from source on Linux requires libcurl. On Debian or Ubuntu use libcurl4-openssl-dev:
sudo apt-get install -y libcurl-dev
Ideally your documentation would describe how to install the gsl Linux package on multiple distributions.
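As a sketch of what "multiple distributions" could look like in practice, the script below only prints the suggested command and never runs it. The function names gsl_install_hint and detect_family are made up for this example. The package names are the ones I believe current releases use (libgsl-dev on Debian/Ubuntu, where older releases used libgsl0-dev as shown above; gsl-devel on Fedora/RHEL; gsl in Homebrew), so please double-check them for your target systems.

```shell
#!/bin/sh
# Print the command that installs the GSL development files for a given
# distribution family. Nothing is executed; the command is only displayed.
gsl_install_hint() {
  case "$1" in
    debian|ubuntu) echo "sudo apt-get install -y libgsl-dev" ;;
    fedora)        echo "sudo dnf install -y gsl-devel" ;;
    centos|rhel)   echo "sudo yum install -y gsl-devel" ;;
    macos)         echo "brew install gsl" ;;
    *)             echo "Please install the GNU Scientific Library (GSL) development files" ;;
  esac
}

# Read the distribution ID from /etc/os-release when it exists
# (most modern Linux systems); otherwise check for macOS.
detect_family() {
  if [ -r /etc/os-release ]; then
    ( . /etc/os-release && echo "${ID:-unknown}" )
  elif [ "$(uname -s)" = "Darwin" ]; then
    echo "macos"
  else
    echo "unknown"
  fi
}

gsl_install_hint "$(detect_family)"
```

The same table of commands can simply be pasted into the README; the script form is mainly useful if you want to print the hint from an install or CI script.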
Disclaimer: I've never developed a package that directly requires a Linux package, but I use them a lot. In case more examples would help, this doc includes a script I use to install stuff on new Ubuntu machines. Some commands were stated explicitly in the package documentation; some had little or no documentation, and required research.
edit 2018-04-07:
I encountered my new favorite example: the sys package uses a config file to produce the following message in the R console. While installing 100+ packages on a new computer, it was nice to see this direct message, and not have to track down the R package and the documentation about its dependencies.
On Debian/Ubuntu this package requires AppArmor.
Please run: sudo apt-get install libapparmor-dev
Another good one is pdftools, which also uses a config file (and is also developed by Jeroen Ooms).
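A message like that can be produced by a few lines of shell in the package's configure script. The following is only a sketch of the idea, not the code from sys or pdftools; missing_lib_message and require_lib are hypothetical helper names, and I am assuming the library ships a pkg-config module (GSL provides one called gsl).

```shell
#!/bin/sh
# Build the message shown when a system library is missing.
# Arguments: library name, Debian/Ubuntu package, Fedora/RHEL package.
missing_lib_message() {
  printf 'Configuration failed because %s was not found.\n' "$1"
  printf 'On Debian/Ubuntu, please run: sudo apt-get install %s\n' "$2"
  printf 'On Fedora/RHEL, please run: sudo dnf install %s\n' "$3"
}

# Succeeds if pkg-config knows the module; otherwise prints the hint to
# stderr and returns 1 so the caller (e.g. a configure script) can abort.
require_lib() {
  if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists "$1"; then
    return 0
  fi
  missing_lib_message "$1" "$2" "$3" >&2
  return 1
}

# In a real configure script a failure would typically be followed by `exit 1`.
require_lib gsl libgsl-dev gsl-devel || echo "gsl not detected"
```

Because the hint is printed during installation, the user sees it directly in the R console instead of a bare compiler error.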
Secondary Issue: installing on Travis
The userfriendlyscience Travis config file apparently installs a lot of binaries directly (including gsl), unlike the current ggstatsplot version.
Alternatively, I'm more familiar with telling travis to install the linux package, as demonstrated by curl's config file. As a bonus, this probably more closely replicates what typical users do on their own machines.
addons:
  apt:
    packages:
      - libcurl4-openssl-dev
Follow up 2018-03-13: Indrajeet and I tweaked the Travis file so it's working. Two sections were changed in the yaml file:
The libgsl0-dev entry was added under the packages section (similar to the libcurl4-openssl-dev entry above).
Packages were listed in the r_binary_packages section so they install as binaries. The build was timing out after 50 minutes, and now it's under 10 min. In this particular package, the r_binary_packages section was nested in the Linux part of the Travis matrix so it wouldn't interfere with his two OS X jobs on Travis.

What is the difference between installing an R package with these two commands?

When I install packages in R, sometimes devtools::install_github() is used and other times install.packages() is used.
Could I ask what the essential difference between them is?
R's official repository for packages is CRAN (the Comprehensive R Archive Network). The process of publishing a package there is very strict, and packages published there are installed via install.packages(). For the most part, binary packages (as opposed to source code, which has not been compiled yet) are available, and no additional tools need to be present for proper installation (see next paragraph).
GitHub is one of many web services that offer repositories for code, including R code. An author can upload her or his package and, if everything is in its place, the user can install the package from source via devtools::install_github(). This means you need to have a proper toolchain installed (and also a distribution of LaTeX). On Windows, this means Rtools. Linux-based operating systems are likely to ship with most of the necessary tools.

Can we install a `.zip` R package under linux?

I have found an old R package with a .zip extension on my PC.
I would like to run it, but I do not have the tar.gz that was used to
create it and I use linux. What are my options?
Few, essentially.
A .zip package for R is almost surely a binary built for Windows so you need to find a suitable Windows computer -- or emulator -- to use it.
So, this can be done this way:
install wine (wine is not an emulator),
install R for Windows, which you download manually from CRAN,
install the zip package using the usual command (install.packages("filename.zip", repos = NULL)). You will probably get error messages about missing dependencies, but if you install those incrementally, it should work.
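Before going down the wine route, it can help to check what kind of archive you actually have. The helper below is a rough heuristic based only on the file name; r_package_kind is a made-up name, and the mapping reflects the usual conventions (.tar.gz for source packages, .zip for Windows binaries, .tgz for macOS binaries), which a renamed or hand-built file may not follow.

```shell
#!/bin/sh
# Guess what kind of R package archive a file is, based only on its name.
# .tar.gz -> source package, .zip -> Windows binary, .tgz -> macOS binary.
r_package_kind() {
  case "$1" in
    *.tar.gz) echo "source" ;;
    *.zip)    echo "windows-binary" ;;
    *.tgz)    echo "macos-binary" ;;
    *)        echo "unknown" ;;
  esac
}

# "oldpackage_1.0.zip" is a placeholder file name.
r_package_kind "oldpackage_1.0.zip"   # prints: windows-binary
```

If the package contains only R code (no compiled code), it is also worth looking inside the zip: the R functions may be recoverable from the installed package even without the original source tarball.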

installing package from R-forge fails [duplicate]

This question is asked over and over and over on the R-sig-finance mailing list,
but I do not think it has been asked on Stack Overflow.
It goes like this:
Where can I obtain the latest version of package XYZ that is hosted on R-forge? I tried to install it with install.packages, but this is what happened:
> install.packages("XYZ",repos="http://r-forge.r-project.org")
Warning message: package ‘XYZ’ is not available (for R version 2.15.0)
Looking on the R-forge website for XYZ, I see that the package failed to build.
Therefore, there is no link to download the source. Is there any other way
to get the source code? Once I get the source code, how can I turn that into a
package that I can load with library("XYZ")?
R-Forge may fail to build a package for a few different reasons. It could be that
the documentation has not been updated to reflect recent changes in the code. Or,
it could be that some of the dependencies were not available at build time.
You can check out the source code using svn. First, search for the project on the
R-Forge website and go to the project home page -- for example http://r-forge.r-project.org/projects/returnanalytics/
Click the SCM link to get to a page like this: http://r-forge.r-project.org/scm/?group_id=579
This page will tell you the command to use to check out the project. In this case you get
This project's SVN repository can be checked out through anonymous access with the following command(s).
svn checkout svn://svn.r-forge.r-project.org/svnroot/returnanalytics/
If you are on Windows, you probably want to download and install TortoiseSVN
Once you have installed TortoiseSVN, you can right click in a Windows Explorer window and select
"SVN checkout". In the "URL of repository:" field, enter everything except the
"svn checkout " part of the command that you found on R-Forge. In this case, you'd
enter "svn://svn.r-forge.r-project.org/svnroot/returnanalytics/".
When you click OK, the project will be downloaded into the current directory.
If you are on a UNIX-alike system (or if you installed the command line client tools
when you installed TortoiseSVN for Windows, which is not the default), you can
type the command that R-forge gave you in your terminal (System terminal, not the R terminal)
svn checkout svn://svn.r-forge.r-project.org/svnroot/returnanalytics/
That will create a new directory under the current working directory that
contains all of the files in the package. In the top level of that directory
will be a subdirectory called "pkg". This particular project (returnanalytics)
contains more than one package.
ls returnanalytics/pkg
#FactorAnalytics MPO PApages PerformanceAnalytics PortfolioAnalytics
But some R-forge projects only have a single package. e.g.
svn checkout svn://svn.r-forge.r-project.org/svnroot/random/
#Checked out revision 14.
ls random/pkg
#DESCRIPTION inst man NAMESPACE R
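If you are not sure which of the two layouts a checkout has, a small helper can list the package directories for you. This is just a sketch: list_r_packages is a hypothetical name, and it assumes the pkg/ layout shown above, treating any directory that contains a DESCRIPTION file as a package.

```shell
#!/bin/sh
# List the package directories inside an R-Forge checkout.
# A directory counts as a package if it contains a DESCRIPTION file.
list_r_packages() {
  pkgdir="$1/pkg"
  if [ -f "$pkgdir/DESCRIPTION" ]; then
    # Single-package project: pkg/ itself is the package.
    echo "$pkgdir"
  else
    # Multi-package project: each subdirectory of pkg/ may be a package.
    for d in "$pkgdir"/*/; do
      if [ -f "${d}DESCRIPTION" ]; then
        echo "${d%/}"
      fi
    done
  fi
  return 0
}

# Prints nothing if the checkout directory does not exist.
list_r_packages returnanalytics
```

Each path it prints is something you can pass to R CMD INSTALL.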
Now that you have a local copy of all the code, if you would like to be able to
install the package, you have to build it first.
A WORD OF CAUTION: Since R-Forge failed to build the package, there is a good chance
that there are problems with the package. Therefore, if you just build it, you may find
that some things do not work as expected. In particular, it is likely that there
is missing or incomplete documentation.
If you are on a UNIX-alike system, the package can be built and installed relatively easily. For a multi-package project like returnanalytics, if you want to install, e.g. the
PortfolioAnalytics package, you can do it like this
R --vanilla CMD INSTALL --build returnanalytics/pkg/PortfolioAnalytics
"PortfolioAnalytics" is the name of the directory that contains the package that
you want to build/install. For a single-package project, you can build and install like
this
R --vanilla CMD INSTALL --build random/pkg
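If you do this often, you might wrap the command in a small function that checks the obvious things first. The wrapper below is only a sketch; install_from_checkout is a hypothetical helper name, and the only thing it adds to the command above is two sanity checks.

```shell
#!/bin/sh
# Install a package from a local checkout, with two sanity checks first.
# install_from_checkout is a hypothetical helper name, not an R or svn tool.
install_from_checkout() {
  pkg_path=$1
  if [ ! -f "$pkg_path/DESCRIPTION" ]; then
    echo "No DESCRIPTION file in '$pkg_path'; this is not a package directory." >&2
    return 1
  fi
  if ! command -v R >/dev/null 2>&1; then
    echo "R was not found on the PATH." >&2
    return 1
  fi
  # Same command as in the answer above.
  R --vanilla CMD INSTALL --build "$pkg_path"
}
```

For example, install_from_checkout returnanalytics/pkg/PortfolioAnalytics for the multi-package project, or install_from_checkout random/pkg for the single-package one.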
If you would like to build/install a package on Windows, see this question and follow the two links that @JoshuaUlrich provided.
More information can be found in R Installation and Administration, the R-Forge User Manual, and the SVN manual.
If (and only if) you have the appropriate toolchain for your OS, then this may succeed:
# First download source file to your working directory
# As an example use browser to download pkg:partykit from:
# http://download.r-forge.r-project.org/src/contrib/partykit_1.1-2.tar.gz
# Move to working directory
# Or in the case of returnanalytics (which is a bundle of packages):
# http://r-forge.r-project.org/R/?group_id=579 and download the tar.gz (source)
# Then in R:
install.packages("partykit_1.1-2.tar.gz", repos = NULL, type = "source")
# for the first of the ReturnAnalytics packages:
install.packages("Dowd_0.11.tar.gz", repos = NULL, type = "source")
These directions should be "cross-platform". I'm not sure the directions in the accepted answer are applicable to Macs (OSX). (I later confirmed that they do "work" on a Mac but found the process more involved than what I suggested above. They do result in a directory that contains the packages in a form that should succeed with R --vanilla CMD INSTALL --build pathToEachPackageSeparately)
It is also possible that the current version of the package you are trying to install requires a newer version of R. For example, you may see an error like:
"ERROR: this R is version 2.15.0, package 'PerformanceAnalytics' requires R >= 3.0.0"
In that case you can try to update R.
Or, if you are in the same situation as me, trying to use pqR (currently based on R version 2.15), you can find older, archived versions of the package here:
http://cran.at.r-project.org/src/contrib/Archive/PerformanceAnalytics/
You can get there from the R-Forge packages page -> "Stable Release: Get PerformanceAnalytics 1.4.3541 from CRAN" -> Old sources: PerformanceAnalytics archive.
For example, you will find that PerformanceAnalytics version 1.1.0 only requires R >= 2.14.
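When hunting through the archive for a release that fits your R, a quick version comparison in the shell can save some trial and error. This is a minimal sketch: version_ge is a made-up helper, it only understands plain dotted numeric versions such as 2.15.0 (not forms like 1.4-3541), and the version numbers are the ones from the error message above.

```shell
#!/bin/sh
# Return success if version $1 is greater than or equal to version $2.
# Fields are compared numerically, so 2.15.0 is newer than 2.9.0.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = split(a, x, "."); m = split(b, y, ".")
    len = (n > m) ? n : m
    for (i = 1; i <= len; i++) {
      xi = (i <= n) ? x[i] + 0 : 0   # missing fields count as 0
      yi = (i <= m) ? y[i] + 0 : 0
      if (xi > yi) exit 0
      if (xi < yi) exit 1
    }
    exit 0
  }'
}

# Installed R version vs. the version the package requires.
if version_ge 2.15.0 3.0.0; then
  echo "R is new enough"
else
  echo "R is too old: update R or look for an archived release"
fi
```

With the numbers above this takes the second branch, which matches the error shown earlier; with 2.15.0 against 2.14 it would take the first.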
Good luck
Alternatively, you can install the particular package from GitHub, if it has a repo at GitHub.
I ran install.packages('ggfortify') and got
Warning message: “package ‘ggfortify’ is not available (for R version 3.3.2)”
sinhrks/ggfortify was the GitHub repo for the same package.
The devtools library allows you to install a package from GitHub directly with install_github('username/repo').
library(devtools)
install_github('sinhrks/ggfortify')
