Unable to install an R library in Azure ML

I have been trying to install a machine learning package that I can use in my R script.
I have placed the package tarball inside a zip file and am calling
install.packages("src/packagename_2.0-3.tar.gz", repos = NULL, type="source")
from within the R script. However, the progress indicator just spins indefinitely, and the package is never installed in the environment.
How can I install this package?
ada is the package I'm trying to install and ada_2.0-3.tar.gz is the file I'm using.

You cannot use tarball (source) packages here. If you are on Windows, do the following:
When you install a package (and its dependencies) locally, R downloads the binaries to a directory such as
C:\Users\xxxxx\AppData\Local\Temp\some directory name\downloaded_packages
These files are in zip format, and they are the packages you need.
Alternatively, download the Windows binaries from CRAN.
Next, put all the required packages into a single zip file and upload it to AzureML as a new dataset.
In AzureML, connect that dataset to an R script and run:
install.packages("src/ada.zip", lib = ".", repos = NULL, verbose = TRUE)
library(ada, lib.loc=".", verbose=TRUE)
Be sure to check that all dependencies are available in Azure; rpart, for example, already is.
For a complete overview, see this MSDN blog post, which explains the process in more detail with some visuals.
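When the bundle holds several archives, the install step can loop over them instead of naming each one. The following is a minimal sketch, assuming the uploaded zip is unpacked under src/ (as the install.packages() call above implies) and that the inner archives are Windows binary zips. The helper names find_package_zips and install_bundled are hypothetical and are not part of any Azure ML API.

```r
# Hypothetical helper: list the package archives unpacked under `src_dir`.
find_package_zips <- function(src_dir = "src") {
  sort(list.files(src_dir, pattern = "\\.zip$", full.names = TRUE))
}

# Hypothetical helper: install every bundled archive into `lib`.
# Dependencies are not resolved here, so each one must be in the bundle
# or already available in the environment.
install_bundled <- function(src_dir = "src", lib = ".") {
  zips <- find_package_zips(src_dir)
  for (zip_file in zips) {
    install.packages(zip_file, lib = lib, repos = NULL, verbose = TRUE)
  }
  invisible(zips)
}

# Usage inside the R script (not run here):
# install_bundled("src", lib = ".")
# library(ada, lib.loc = ".")
```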

Related

Is there a way to 'install' R packages without running install.packages()?

We are testing how to run R in the cloud in a secure, isolated environment that is blocked from CRAN and also cannot use install.packages(). We defined an environment based on Anaconda's R Essentials bundle, but we would still like to customize it on demand with extra packages. Is there a way to simulate install.packages(), e.g. by downloading the package offline, zipping it, copying it to the secure environment and unzipping it to a specific location in the library folder?
thanks!
You can download the package from CRAN as a zip and then transport it to the isolated PC as a file. For example, here is the link to dplyr on CRAN: https://cran.r-project.org/web/packages/dplyr/index.html
Then use the code below to install the local file:
install.packages("~/Downloads/dplyr_1.0.7.zip", repos = NULL)
On Windows you might need Rtools. I got a warning about it, but the package still installed.
For Linux machines, you can build the package from source using the tarball from the same page:
install.packages("~/Downloads/dplyr_1.0.7.tar.gz", repos = NULL, type = "source")
In both cases you need to take care of dependencies yourself, as they are not resolved when installing this way (look at the "Imports" field on the package's CRAN page).
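Since dependencies are not resolved for you, it can help to list what a package declares before copying files across. The sketch below reads the Depends, Imports and LinkingTo fields from a package's DESCRIPTION file with read.dcf(). The function name list_declared_deps is hypothetical, and the sketch assumes you have already extracted DESCRIPTION from the source tarball.

```r
# Hypothetical helper: return the package names declared in a DESCRIPTION file.
list_declared_deps <- function(description_path) {
  fields <- c("Depends", "Imports", "LinkingTo")
  dcf <- read.dcf(description_path, fields = fields)
  present <- dcf[1, ][!is.na(dcf[1, ])]
  raw <- unlist(strsplit(present, ","), use.names = FALSE)
  # Drop version constraints such as "(>= 3.4.0)" and surrounding whitespace.
  pkgs <- trimws(sub("\\(.*\\)", "", raw))
  # "R" itself is a version requirement, not a package to install.
  setdiff(unique(pkgs[nzchar(pkgs)]), "R")
}

# Usage (not run here): extract DESCRIPTION from the tarball first.
# untar("~/Downloads/dplyr_1.0.7.tar.gz", files = "dplyr/DESCRIPTION",
#       exdir = tempdir())
# list_declared_deps(file.path(tempdir(), "dplyr", "DESCRIPTION"))
```

Each listed package has its own dependencies, so the same check has to be repeated for every archive you copy.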

How do I keep source files when using R's devtools library function 'install'

I am trying to build an R package (DESeq2) from source so that I can debug it. I've installed all the dependencies required and I'm following Hilary Parker's instructions for creating R packages. I'm running this on CentOS 6.6 using R-3.4.2.
I run:
library("devtools")
install("DESeq2", keep_source=TRUE)
It installs it in the directory with all my other R libraries. When I look at the installed DESeq2 library it is missing all the DESeq2/R/*.R and DESeq2/src/*.cpp files.
QUESTION : Where are these files and why didn't they get installed? This does not seem like the expected behavior.
R stores the objects of an installed package in a binary, database-like file format for efficiency reasons (lazy loading). These database files (*.rdb and *.rdx) are stored in the R sub-folder of the package installation path (see ?lazyLoad).
Even if you are looking in the right place for the installed package (use .libPaths() in R to find the installation folder) and you have installed the package with its source code (as you did, or via install.packages("a_CRAN_package", INSTALL_opts = "--with-keep.source")), you will not find R files in the R folder there.
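To see this layout for yourself, you can list the R sub-folder of any installed package. The sketch below uses the base package stats only because it exists in every R installation; installed_r_files is a hypothetical helper name.

```r
# Hypothetical helper: list the files in the R/ sub-folder of an installed package.
installed_r_files <- function(pkg) {
  list.files(system.file("R", package = pkg))
}

# Expect lazy-load database files (*.rdb, *.rdx) rather than the original sources.
print(installed_r_files("stats"))

# Usage for the package in question (not run here):
# installed_r_files("DESeq2")
```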
You can verify that the source code is available by picking a function from the package and printing it to the console. If you can see the source code (with comments), the package sources (R files) are available:
print(DESeq2::any_function)
To make the source code available for debugging and stack traces you can set the option keep.source.pkgs = TRUE (see ?options) in your .Rprofile file or via an environment variable:
keep.source.pkgs:
As for keep.source, used only when packages are
installed. Defaults to FALSE unless the environment variable
R_KEEP_PKG_SOURCE is set to yes.
Note: The source code is available then only for newly installed and updated packages (not for already installed packages!).
For more details see: https://yetanothermathprogrammingconsultant.blogspot.de/2016/02/r-lazy-load-db-files.html
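The printing check described above can also be done programmatically: a function that kept its source carries a "srcref" attribute. This is a minimal sketch with a hypothetical helper name, has_source, demonstrated on two throwaway functions parsed with and without keep.source so that it runs without DESeq2 installed.

```r
# Hypothetical helper: TRUE if the function still carries its original source text.
has_source <- function(fn) !is.null(attr(fn, "srcref"))

# The same function text, parsed once with and once without source retention.
src <- "function(x) {\n  # add one\n  x + 1\n}"
with_src    <- eval(parse(text = src, keep.source = TRUE))
without_src <- eval(parse(text = src, keep.source = FALSE))

has_source(with_src)     # TRUE
has_source(without_src)  # FALSE

# Usage for an installed package (not run here):
# has_source(DESeq2::DESeq)
```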

How to use R CMD INSTALL without the dependency check?

I'm running R CMD INSTALL --build package on a Windows computer. My package imports a couple of other packages, which themselves depend on some more packages. I have all dependencies installed in the local r_libs folder and everything works.
Sometimes, however, I have my package source code on a different Windows computer, where not all of the dependency packages are installed.
When I try to use R CMD INSTALL --build package, I get the obvious "ERROR: dependencies 'package a', 'package b', etc, are not available for package".
My question is: Can I build the package using R CMD INSTALL --build without the dependency checks and without removing the Imports and Depends entries in the DESCRIPTION file?
After consulting --help, I tried the --no-test-load option but no luck.
I reckon you want to build a .zip binary version of the package on a computer where not all dependencies are installed. And I'm afraid I'll have to disappoint you, as this won't be possible.
Building a binary package is done in two steps: first the package is installed from source (that's why you have to use R CMD INSTALL), and then the created binaries are zipped in a convenient format for installation on a Windows machine. The dependencies are checked at the time of installation from source, and any missing dependencies will throw the error you're facing.
As R needs information from the dependencies at time of installation from source, you can't get around installing them before building the whole thing. This also makes sense. An installed package in R contains a set of .rds files which contain package information in a more convenient format for R. In order to create that information for the NAMESPACE file, it needs to be able to access the packages from which functions are imported. If not, it can't construct the correct information about the namespace.
So your only option is to install the dependencies on the computer you use to build. And if you actually want to use the package on that computer, you'll have to install those dependencies anyway.
More information:
R Internals : https://cran.r-project.org/doc/manuals/r-release/R-ints.html#Package-Structure
Writing R Extensions: https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Package-namespaces
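Because the dependencies have to be present on the build machine, a quick check of which ones are missing can save a failed build. A minimal sketch follows; missing_deps is a hypothetical helper name and 'package a' stands in for the names from your own DESCRIPTION file.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
# system.file() returns "" when a package cannot be found in the library paths.
missing_deps <- function(pkgs, lib.loc = NULL) {
  installed <- vapply(
    pkgs,
    function(p) nzchar(system.file(package = p, lib.loc = lib.loc)),
    logical(1)
  )
  pkgs[!installed]
}

# Usage (not run here): install whatever is reported before running
# R CMD INSTALL --build.
# missing_deps(c("package a", "package b"))
```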

How to build R package from GitHub?

I am trying to build a fork of an R package from GitHub (the fork has a fresh bugfix). I am able to build AND install the package from GitHub:
require(devtools)
install_github("patcpsc/rredis", build_vignettes = FALSE)
However, this doesn't produce an installable package, or does it? I need to install this package on 15 machines, so I would prefer to build the package once and then copy and install it on the other machines.
I tried to look for a function like build_github, but unfortunately there is none. How do I do it?
GitHub has help documentation on how to fork a repository. It sounds like you've done the first part. Now you just need to clone the repository, which means taking a copy to your local machine so you can work on it. The buttons you want are on the right. "Clone in Desktop" is for when you use the GitHub desktop software. If you are running git from a command line, type
git clone git@github.com:whatever-the-link-is-in-the-SSH-clone-url-textbox
Once you have a local copy of the repository, in R you do
library(devtools)
build("path/to/package/root")
I thought you wanted to actually work on the package. If you just want to download the source, there's a "Download ZIP" button right underneath the clone options. Download, unzip, then build in R as above.
This is an old question, and a lot has changed since 2014. The workhorse is now the remotes package.
If you want an installable package, one is created in your temp directory.
I usually don't want to install into my main library, so I create a temporary one:
dir.create(tmp_lib <- "tmp_lib")
.libPaths(c(tmp_lib,.libPaths()))
.libPaths()
You can skip that step if it is not needed. Then run the standard commands:
require(devtools)
install_github("patcpsc/rredis", build_vignettes = FALSE)
Now navigate to the temporary location given by tempdir() (on Windows, a shortcut is shell.exec(tempdir())).
You should see a folder named fileXXXXXXXX, which should contain the rredis_1.6.9.tar.gz file. This is what you need.
Finally, unlink(tmp_lib, recursive=TRUE) removes the temporary library.
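Rather than browsing the temp folder by hand, the built tarball can be located and copied from R. This is a minimal sketch, assuming (as described above) that the install step leaves a .tar.gz somewhere under tempdir(); whether it does may depend on the devtools/remotes version. The helper names are hypothetical.

```r
# Hypothetical helper: find source tarballs anywhere under `search_dir`.
find_built_tarballs <- function(search_dir = tempdir()) {
  list.files(search_dir, pattern = "\\.tar\\.gz$",
             recursive = TRUE, full.names = TRUE)
}

# Hypothetical helper: copy the tarballs to `dest_dir` and return the new paths.
copy_built_tarballs <- function(dest_dir, search_dir = tempdir()) {
  dir.create(dest_dir, recursive = TRUE, showWarnings = FALSE)
  tarballs <- find_built_tarballs(search_dir)
  file.copy(tarballs, dest_dir, overwrite = TRUE)
  file.path(dest_dir, basename(tarballs))
}

# Usage (not run here): copy the tarball out, then on each target machine run
# install.packages("rredis_1.6.9.tar.gz", repos = NULL, type = "source")
```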

Manually Downloading and Installing Packages in R

I am currently trying to run some R code on a computing cluster but cannot run the install.packages function due to some weird firewall settings on my cluster. Since I am only using a few packages in my R code, I was hoping to avoid using the install.packages function by downloading and installing the packages manually.
Note: I am aware that there is a way to avoid this issue by using an HTTP proxy as described in the R FAQ. Unfortunately the people in charge of my cluster are not being helpful in setting this up so I'm forced to consider this alternative approach.
Ideally, I would like to download the packages files from CRAN to my computer, then upload these files to the cluster and install them using the appropriate commands in R. In addition, I would also like to make sure that the packages are installed to a location of my choice since I do not have the permission to "write" in the default R directory (I believe that I can do this within R by using the .libPaths function)
Lastly, the computers that I am working with on the cluster are Unix x86_64.
You can install the package manually using the following command
install.packages('package.zip', lib='destination_directory',repos = NULL)
See ?install.packages for further details.
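For the cluster case in the question, the destination directory usually has to be created and added to the library search path first. A minimal sketch follows; use_private_library is a hypothetical helper name and the paths in the usage comments are placeholders. Note that on a Unix machine the file to install would be a source tarball (.tar.gz) rather than a Windows .zip binary.

```r
# Hypothetical helper: create a personal library and put it first in .libPaths().
use_private_library <- function(lib_dir) {
  dir.create(lib_dir, recursive = TRUE, showWarnings = FALSE)
  .libPaths(c(lib_dir, .libPaths()))
  invisible(.libPaths()[1])
}

# Usage (not run here; paths are placeholders):
# lib <- use_private_library("~/R/library")
# install.packages("path/to/package.tar.gz", lib = lib,
#                  repos = NULL, type = "source")
# library(package, lib.loc = lib)
```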
I also went through the same problem while installing the caret package. There are many dependencies of the caret package.
So, I did the following:
install.packages('caret')
This downloads all the packages in zip format; the download location is shown in the error message. Unzip all the packages from the download location to a folder, for example C:/PublicData/RawRPackages, then run the following commands.
foldername <- 'C:/PublicData/RawRPackages'
install.packages(paste(foldername, 'caret', sep = '/'),
repos = NULL, type = "source")
library(caret, lib.loc = foldername)
This is a better way if we want to download and install locally:
download.packages('lib_name',destdir='dest_path')
For example:
download.packages('RJDBC',destdir='d:/rlibs')
install.packages("libname",lib = "file://F:/test")
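One way to connect the two steps: download.packages() returns a two-column matrix of package names and downloaded file paths, and those paths can be passed to install.packages() with repos = NULL. Note that lib expects a plain directory path rather than a file:// URL. The sketch below uses a hypothetical helper, install_from_download; its install_fun argument is an assumption added only so the loop can be dry-run without installing anything.

```r
# Hypothetical helper: install every file listed in the matrix returned by
# download.packages() into the library directory `lib`.
install_from_download <- function(downloaded, lib,
                                  install_fun = utils::install.packages) {
  stopifnot(is.matrix(downloaded), ncol(downloaded) == 2)
  files <- downloaded[, 2]  # column 1: package names, column 2: file paths
  for (f in files) {
    install_fun(f, lib = lib, repos = NULL)
  }
  invisible(files)
}

# Usage (not run here; needs network access for the download step):
# dl <- download.packages("RJDBC", destdir = "d:/rlibs")
# install_from_download(dl, lib = "F:/test")
```

Dependencies are still not resolved automatically, so they have to be downloaded and installed the same way.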
