How to install R packages into Azure Machine Learning

I have trained a model locally using the R package locfit. I am now trying to run this in Azure Machine Learning.
Most guides/previous questions appear to be in relation to Azure Machine Learning (classic). Although I believe the process outlined in similar posts (e.g. here and here) will be similar, I am still unable to get it to work.
I have outlined the steps I have followed below:
Download locfit R package for windows Zip file from here
Put this downloaded Zip file into a new Zip file entitled "locfit_package"
I upload this "locfit_package" zip folder to AML as a dataset (Create Dataset > From Local Files > name: locfit_package, dataset type: file > Upload the zip ("locfit_package") > Confirm the upload is correct).
In the R terminal I then execute the following code:
install.packages("src/locfit_package.zip", lib = ".", repos = NULL, verbose = TRUE)
library(locfit_package, lib.loc=".", verbose=TRUE)
library(locfit)
The following error message is then returned:
system (cmd0): /usr/lib/R/bin/R CMD INSTALL
Warning: invalid package ‘src/locfit_package.zip’
Error: ERROR: no packages specified
Warning message:
In install.packages("src/locfit_package.zip", lib = ".", repos = NULL, : installation of package ‘src/locfit_package.zip’ had non-zero exit status
Error in library(locfit_package, lib.loc = ".", verbose = TRUE) : there is no package called ‘locfit_package’
Execution halted

I just checked the documentation, and it says: "The Execute R Script module does not support installing packages that require native compilation, like the qdap package which requires Java and the drc package which requires C++. This is because this module is executed in a pre-installed environment with non-admin permission. Do not install packages which are pre-built on/for Windows, since the designer modules run on Ubuntu. To check whether a package is pre-built on Windows, you can go to CRAN, search for your package, download a binary file according to your OS, and check the Built: part in the DESCRIPTION file. Following is an example:"
And the sample code:
# R version: 3.5.1
# The script MUST contain a function named azureml_main,
# which is the entry point for this module.
# Note that functions dependent on the X11 library,
# such as "View," are not supported because the X11 library
# is not preinstalled.
# The entry point function MUST have two input arguments.
# If the input port is not connected, the corresponding
# dataframe argument will be null.
# Param<dataframe1>: a R DataFrame
# Param<dataframe2>: a R DataFrame
azureml_main <- function(dataframe1, dataframe2){
  print("R script run.")
  if(!require(zoo)) install.packages("zoo", repos = "https://cloud.r-project.org")
  library(zoo)
  # Return datasets as a Named List
  return(list(dataset1=dataframe1, dataset2=dataframe2))
}
Could you please check whether your package falls into one of these cases?
Reference document: https://learn.microsoft.com/en-us/azure/machine-learning/algorithm-module-reference/execute-r-script
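As a side note (not part of the original answer), if a copy of the package is already installed locally you can read the Built: field mentioned above directly from R, instead of downloading the binary and opening DESCRIPTION by hand:
# Minimal sketch: show which platform/R version an installed package was built for.
# A Windows binary build mentions mingw32 / x86_64-w64-mingw32 in this field,
# whereas an Ubuntu/Linux build does not.
packageDescription("locfit", fields = "Built")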

Related

Loading packages installed from a local .zip file

I am trying to load a self-written package in order to use the functions I created there without having to copy and paste them into each new project. The program is having difficulties to recognize the package though. These are the lines I am inputting and the messages issued by the system.
> .libPaths(c(.libPaths(), "path-to-package"))
> install.packages("somepath/myPackage.zip", repos = NULL, lib = "path-to-package")
> library("myPackage", lib.loc = "path-to-package")
Error in library("myPackage", lib.loc = "path-to-package") :
‘myPackage’ is not a valid installed package
After issuing "install.packages", new directories and files are displayed in Microsoft Explorer. While other files vanish at the end, the directory named "myPackage" remains.
Would it be easier to try and get R to do my bidding and make it recognize myPackage as a package not only when installing it, but also when loading it? Or is it easier to copy and paste the functions from the package when I need them?
I do own the book "R Packages" by Hadley Wickham in its first edition from 2015. Are there any passages I might want to reread to get going?
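A common alternative (an assumption on my part, not something stated in the question) is to install a self-written package straight from its source directory with devtools, rather than from a Windows .zip, and then load it from the default library:
# Sketch: install a home-grown package from its source folder.
# "path/to/myPackage" is a hypothetical directory containing DESCRIPTION, NAMESPACE and R/.
if (!require(devtools)) install.packages("devtools")
devtools::install("path/to/myPackage")
library(myPackage)   # now found on the normal .libPaths(), no lib.loc needed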

Programmatically extracting R binary package contents and moving them to library folder?

Using R version 3.6.2, RStudio version 1.2.5033, Windows 10
I am compiling a report in R Markdown and am finished except for some troubleshooting with the packages, specifically those that need compilation and are delivered in binary format to a temp folder outside my working directory.
I am able to extract and copy these binary package contents individually to my library folder on my machine, but generally assume those reading the report will not have all the packages already installed.
Normally I handle packages requirements as follows:
# Install devtools (needed for GitHub installs) if not present
if (!require(devtools)) {
  install.packages("devtools", dependencies = TRUE)
}
# Install installr from CRAN if not present
if (!require(installr)) {
  install.packages("installr", dependencies = TRUE)
}
# Now we can use abbreviated require2 function for remainder
installr::require2(stringi, ask = FALSE, dependencies = TRUE)
.
.
.
What I am having trouble with is that the 'stringi' package asks for compilation when the command runs on a machine without this package, and if I select 'no', it downloads as a binary to a temporary folder. I then need to extract this manually outside of RStudio, which I don't want others to have to do.
So, is there a way to automate this process? Is there a better approach to what my aim is here that removes the issue altogether?
Thank you in advance for your insights and your patience (this is my first question).
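One way to avoid the compile prompt altogether (an assumption, not something taken from this thread) is to tell install.packages() up front to prefer CRAN's pre-built binaries, so readers of the report are never asked about compiling stringi from source:
# Sketch: on Windows, prefer pre-built binaries and never prompt to compile
# a newer source version of a package.
options(install.packages.compile.from.source = "never")
if (!require(stringi)) {
  install.packages("stringi", dependencies = TRUE, type = "binary")
}
library(stringi)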

unable to install R library in azure ml

I have been trying to install a machine learning package that I can use in my R script.
I have placed the tarball of the installer inside a zip file and am running
install.packages("src/packagename_2.0-3.tar.gz", repos = NULL, type="source")
from within the R script. However, the progress indicator just spins indefinitely and the package is never installed into the environment.
How can I install this package?
ada is the package I'm trying to install and ada_2.0-3.tar.gz is the file I'm using.
You cannot use the tarball packages. If you are on Windows you need to do the following:
Once you install a package (+ its dependencies) it will download the packages into a directory such as
C:\Users\xxxxx\AppData\Local\Temp\some directory name\downloaded_packages
These will be in a zip format. These are the packages you need.
Or download the windows binaries from cran.
Next you need to put all the needed packages in one total zip-file and upload this to AzureML as a new dataset.
In AzureML, load the data package connected to an R script and run:
install.packages("src/ada.zip", lib = ".", repos = NULL, verbose = TRUE)
library(ada, lib.loc=".", verbose=TRUE)
Be sure to check that all dependent packages are available in Azure; rpart is available.
For a complete overview, look at this MSDN blog post, which explains it a bit better with some visuals.
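To work out which dependencies you actually need to bundle into that zip (a side note, not part of the original answer), the tools package can list them from the CRAN metadata:
# Sketch: list the packages ada depends on or imports, recursively,
# so the matching Windows binaries can be added to the uploaded zip.
library(tools)
deps <- package_dependencies("ada", db = available.packages(),
                             which = c("Depends", "Imports"), recursive = TRUE)
deps$ada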

Building an R package with a pre-compiled shared library

I am facing a problem with an R package that I am writing and trying to build with a pre-compiled shared library. Let me try to briefly describe the problem:
this package (let's call it mypack) relies on a shared library mylib.dll that is already compiled and that I cannot compile on the fly while building the R package.
the library mylib.dll has been compiled on a x64 machine under Windows and can be loaded in R with dyn.load.
the package contains the required file NAMESPACE, where useDynLib(mylib.dll) is specified. The function .onLoad containing the instruction library.dynam('mylib.dll', pkg, lib) is also specified in a file zzz.R.
the R package mypack is built with Rtools using the usual command Rcmd INSTALL, and I then add a directory libs where I save mylib.dll.
when I try to load the package in R with library(mypack), I get the following error message:
Error: package 'mypack' is not installed for 'arch=x64'
This is puzzling me. Why can the shared library be loaded smoothly in R, but when I build a package using it I am getting this weird error message?
Thank you very much in advance for your help!
That error message comes from this code in library:
if (nzchar(r_arch) && file.exists(file.path(pkgpath, "libs")) &&
    !file.exists(file.path(pkgpath, "libs", r_arch)))
    stop(gettextf("package %s is not installed for 'arch=%s'",
                  sQuote(pkgname), r_arch), call. = FALSE, domain = NA)
which tells me you need a {package}/libs/{arch} folder in your built package (i.e. the installed directory), with an {arch} that matches your system's architecture, as given by r_arch <- .Platform$r_arch.
I'm guessing your build has failed to make this correctly. Is there any C code in your source?
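A quick way to check this on your own machine (a sketch; "mypack" is the package name used in the question):
# Sketch: verify that the installed package has a libs/<arch> directory
# matching the architecture of the running R session.
r_arch  <- .Platform$r_arch          # e.g. "x64" on a 64-bit Windows build of R
pkgpath <- find.package("mypack")    # location of the installed package
file.exists(file.path(pkgpath, "libs", r_arch))   # must be TRUE when libs/ exists and r_arch is non-empty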

error with R CMD check because of package dependency

Background
I am creating a package, newpackage, that depends on oldpackage, and have indicated this dependency in the file newpackage/DESCRIPTION.
Furthermore,
oldpackage is installed in the directory, ~/lib/R
my .Rprofile includes .libPaths("~/lib/R")
hence, I can successfully load oldpackage without specifying the library location, e.g., using the command library(oldpackage) in R
Despite being able to load the package without specifying its library, R CMD check newpackage gives an error indicating that it cannot find oldpackage:
checking whether the package can be loaded ... ERROR
Loading required package: oldpackage
Error: package 'oldpackage' could not be loaded
In addition: Warning message:
In library(pkg, character.only = TRUE, logical.return = TRUE, lib.loc = lib.loc) :
there is no package called 'oldpackage'
Execution halted
It looks like this package has a loading problem: see the messages for
details.
Questions:
Why is R unable to find the package?
Can I specify the library location in the DESCRIPTION file?
Regarding question 1), it is both a FAQ and yet somewhat annoying. R CMD check runs in vanilla mode, so it will not find user-level libraries. As I recall, setting R_LIBS="...." in the call helps, so try that.
Regarding question 2), no you cannot give a location in DESCRIPTION. Which makes sense as that file needs to work 'everywhere' whereas your location info is local to your machine.
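For example (a sketch assuming the ~/lib/R location from the question), the variable can be set just for that one invocation in a Unix-like shell:
R_LIBS=~/lib/R R CMD check newpackage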
