Publishing AzureML Webservice from R requires external zip utility - r

I want to deploy a basic trained R model as a webservice to AzureML. Similar to what is done here:
http://www.r-bloggers.com/deploying-a-car-price-model-using-r-and-azureml/
Since that post, the publishWebService function in the R AzureML package has changed: it now requires a workspace object as its first parameter, so my R code looks as follows:
library(MASS)
library(AzureML)
PredictionModel = lm( medv ~ lstat , data = Boston )
PricePredFunktion = function(percent)
{return(predict(PredictionModel, data.frame(lstat =percent)))}
myWsID = "<my Workspace ID>"
myAuth = "<my Authorization code>"
ws = workspace(myWsID, myAuth, api_endpoint = "https://studio.azureml.net/", .validate = TRUE)
# publish the R function to AzureML
PricePredService = publishWebService(
ws,
"PricePredFunktion",
"PricePredOnline",
list("lstat" = "float"),
list("mdev" = "float"),
myWsID,
myAuth
)
But every time I execute the code I get the following error:
Error in publishWebService(ws, "PricePredFunktion", "PricePredOnline", :
Requires external zip utility. Please install zip, ensure it's on your path and try again.
I tried installing programs that handle zip files (like 7zip) on my machine as well as calling the utils library in R which allows R to directly interact with zip files. But I couldn't get rid of the error.
I also found the R package code that throws the error; it is on line 154 of this page:
https://github.com/RevolutionAnalytics/AzureML/blob/master/R/internal.R
but it didn't help me in figuring out what to do.
Thanks in advance for any Help!

The Azure Machine Learning API requires the payload to be zipped, which is why the package insists on the zip utility being installed. (This is an unfortunate situation, and hopefully we can find a way in future to include a zip with the package.)
It is unlikely that you will ever encounter this situation on Linux, since most (all?) Linux distributions include a zip utility.
On Windows, however, you have to go through the following procedure once:
Install a zip utility (RTools has one and this works)
Ensure the zip is on your path
Restart R – this is important, otherwise R will not recognize the changed path
Upon completion, the litmus test is whether R can see your zip. To check this, try:
Sys.which("zip")
You should get a result similar to this:
zip
"C:\\Rtools\\R-3.1\\bin\\zip.exe"
In other words, R should recognize the installation path.
On previous occasions when people told me this didn’t work, it was always because they thought they had a zip in the path, but it turned out they didn’t.
One last comment: installing 7zip may not work. The reason is that 7zip's command-line utility is called 7z, but R will only look for a utility called zip.
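The litmus test above can be wrapped in a small helper. This is a minimal sketch: the function name zip_available is made up for illustration and is not part of the AzureML package; it relies only on base R's Sys.which.

```r
# Hypothetical helper (not part of the AzureML package): reports whether
# R can find an executable called "zip" on the current PATH.
zip_available <- function() {
  zip_path <- unname(Sys.which("zip"))  # "" when no zip is found
  nzchar(zip_path)
}

if (zip_available()) {
  message("zip found at: ", Sys.which("zip"))
} else {
  message("No zip on the PATH; install one (e.g. via Rtools), fix the PATH, and restart R.")
}
```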

I had seen this answer earlier, but two additional details were keeping my code from working:
1. The location and path of Rtools were not as straightforward as expected
2. You need to restart R
With regard to the location: always check where Rtools was actually installed. I also used this code to set the path, and ALWAYS ADD ZIP at the end:
##Rtools.bin="C:\\Users\\User_2\\R-Portable\\Rtools\\bin"
Rtools.bin="C:\\Rtools\\bin\\zip"
sys.path = Sys.getenv("PATH")
if (Sys.which("zip") == "" ) {
system(paste("setx PATH \"", Rtools.bin, ";", sys.path, "\"", sep = ""))
}
Sys.which("zip")
You should get a return value of
"C:\\Rtools\\bin\\zip"
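As an alternative to setx, which only affects processes started afterwards, the PATH of the running R session can be changed with Sys.setenv. This is a sketch under assumptions: the helper name prepend_to_path is hypothetical, and the Rtools directory shown is only an example that has to be replaced by the actual install location. PATH entries are normally directories, so this sketch adds the folder that contains zip.exe rather than the executable itself.

```r
# Hypothetical helper: put a directory at the front of PATH for this session only.
prepend_to_path <- function(dir) {
  old_path <- Sys.getenv("PATH")
  entries <- strsplit(old_path, .Platform$path.sep, fixed = TRUE)[[1]]
  if (!(dir %in% entries)) {
    Sys.setenv(PATH = paste(dir, old_path, sep = .Platform$path.sep))
  }
  invisible(Sys.getenv("PATH"))
}

# Example location only (assumption): adjust to where Rtools is installed.
rtools_bin <- "C:\\Rtools\\bin"
prepend_to_path(rtools_bin)
Sys.which("zip")  # non-empty once a zip executable is reachable
```

The change lasts only until the session ends, so it does not replace a permanent PATH entry.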

Andrie's comment here: https://github.com/RevolutionAnalytics/AzureML/commit/9cf2c5c59f1f82b874dc7fdb1f9439b11ab60f40
implies that we can just download RTools and be done with it.
Download RTools from:
https://cran.r-project.org/bin/windows/Rtools/
During installation, select the check box to modify the PATH.
At first it didn't work. I then tried 32-bit R, and that seemed to work. Then 64-bit R started working again. Honestly, I am not sure whether I did something in between to make it work. It only takes a few minutes, so it is worth a punt.

Try the following
-Download the Rtools file which usually contains the zip utility.
-Copy all the files in the "bin" folder of "Rtools"
-Paste them in "~/RStudio/bin/x64" folder

Related

R cmd check note: unable to verify current time

When running R CMD check I get the following note:
checking for future file timestamps ... NOTE
unable to verify current time
I have seen this discussed here, but I am not sure which files it is checking for timestamps, so I don't know which files I should look at. This happens locally on my Windows machine and remotely on different systems (using GitHub Actions).
Take a look at https://svn.r-project.org/R/trunk/src/library/tools/R/check.R
The check command relies on an external web resource:
now <- tryCatch({
foo <- suppressWarnings(readLines("http://worldclockapi.com/api/json/utc/now",
warn = FALSE))
This resource http://worldclockapi.com/ is currently not available.
Hence the following happens (see same package source):
if (is.na(now)) {
any <- TRUE
noteLog(Log, "unable to verify current time")
See also references:
https://community.rstudio.com/t/r-devel-r-cmd-check-failing-because-of-time-unable-to-verify-current-time/25589
So, unfortunately this requires a fix in the check function by the R development team ... or the web-resource coming online again.
To add to qasta's answer, you can silence this check by setting the _R_CHECK_SYSTEM_CLOCK_ environment variable to zero, e.g. Sys.setenv('_R_CHECK_SYSTEM_CLOCK_' = 0)
To silence this in a persistent manner, you can set this environment variable on R startup. One way to do so is through the .Renviron file, in the following manner:
install.packages("usethis") (If not installed already)
usethis::edit_r_environ()
Add _R_CHECK_SYSTEM_CLOCK_=0 to the file
Save, close file, restart R
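For a single session, the effect can be checked directly. This is a minimal sketch using only base R; the value is passed as a string here because environment variables are stored as text.

```r
# Disable the clock lookup behind the "future file timestamps" check
# for this session.
Sys.setenv("_R_CHECK_SYSTEM_CLOCK_" = "0")

# Confirm that the variable is set; processes started from this session
# inherit it.
Sys.getenv("_R_CHECK_SYSTEM_CLOCK_")
```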

Alternative to Rtools for openxlsx

I am using R on my work computer inside our network. Naturally, I do not have admin rights, and it would be hard to convince my IT department to make an exception so that I can install Rtools on my computer.
My main issue with not having Rtools is that I cannot use the saveWorkbook command from openxlsx, which would allow me to save data as Excel table objects.
The Error of the command implies that I could use an alternative zip application:
Please make sure Rtools is installed or a zip application is available to R
Would this be possible? Our work computers have 7-Zip, for instance.
In line with the comment of @Tung and others, I copied an Rtools folder from my private computer to my work computer. I tried the following, to no avail:
Rtools.bin="C:\\Rtools\\bin"
sys.path = Sys.getenv("PATH")
if (Sys.which("zip") == "" ) {
system(paste("setx PATH \"", Rtools.bin, ";", sys.path, "\"", sep = ""))
}
I also tried using Sys.setenv("R_ZIPCMD" = "C:/Program Files/7-Zip/7zG.exe") to use 7-Zip, but then I get the error message Incorrect Switch postfix: -r1
I am specifically trying to replicate the writeDataTable example from openxlsx.
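One plausible reading of the Incorrect Switch postfix error: R_ZIPCMD only replaces the executable, while the flags are still written in the syntax of an Info-ZIP style zip program, which 7-Zip does not understand. The sketch below inspects the defaults of base R's utils::zip to illustrate this. Whether openxlsx passes exactly these flags is an assumption (the -r1 in the message suggests it passes similar ones).

```r
# Inspect the defaults of utils::zip(): the flags are written for an
# Info-ZIP style "zip" program, and R_ZIPCMD only swaps the executable.
zip_defaults <- formals(utils::zip)
zip_defaults$flags          # "-r9X"
deparse(zip_defaults$zip)   # Sys.getenv("R_ZIPCMD", "zip")
```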

Packrat with local binary repository

I want to use packrat on a Windows 7 machine with no internet connection.
I have downloaded all binary packages from http://cran.r-project.org/bin/windows/contrib/3.1/ into the local folder C:/xyz/CRAN_3_1.
The problem is now that
packrat::init(options=list(local.repos="C:/xyz/CRAN_3_1"))
throws a bunch of warnings and errors like
Warning: unable to access index for repository http://cran.rstudio/bin/...
Warning: unable to access index for repository http://cran.rstudio/src/...
Fetching sources for Rcpp (0.11.4) ... Failed
Package Rcpp not available in repository or locally
It seems that packrat tries to find, in this order:
the binary version of Rcpp on CRAN (fails since there is no internet connection)
the source of Rcpp on CRAN (fails since there is no internet connection)
the local source of the package (fails since I only have the binaries)
What I don't understand is why packrat does not also search for the local binary package...
Question 1: I could download the source CRAN repository to get around this problem. But I would like to know from you guys whether there is an easier solution to this, i.e., whether it is possible to make packrat accept a local binary repo.
Question 2: When I create my own package myPackage with packrat enabled, will the myPackage-specific local packrat library also be included in the package? That is, assume that I give the binary myPackage zip File to one of my colleagues who does not have one of the packages that myPackage depends on (let's say Rcpp). Will Rcpp be included in myPackage when I use packrat? Or does my colleague have to install Rcpp himself?
I managed to hack around this problem. Please bear in mind that I have never used packrat before and that I do not know its "proper" behaviour. But my impression is that the hack works.
Here is how I did it:
Open your project, load packrat via library(packrat)
type fixInNamespace("snapshotImpl",ns="packrat") - a window opens - copy its content into the clipboard
Go to /yourProjDir/ and create a file snapshotImplFix.R
Copy the clipboard's content into this file ...
... but change the first line to
snapshotImplFix=function (project, available = NULL, lib.loc = libDir(project),
dry.run = FALSE, ignore.stale = FALSE, prompt = interactive(),
auto.snapshot = FALSE, verbose = TRUE, fallback.ok = FALSE,
snapshot.sources = FALSE)
Note snapshot.sources = FALSE! Save and close the file.
Create /yourProjDir/.Rprofile and add
setHook(packageEvent("packrat","onLoad"),function(...) {
source("./snapshotImplFix.R");
tmpfun=get("snapshotImpl",envir=asNamespace("packrat"));
environment(snapshotImplFix)=environment(tmpfun);
utils::assignInNamespace(x="snapshotImpl",value=snapshotImplFix,ns="packrat");})
Points 2-6 fix the problem with the snapshot.sources argument being TRUE by default (I did not find a better way to change that...)
Finally, we have to tell packrat to take our local repository. It's important that you have the right folder structure. Therefore I moved the repo from C:/xyz/CRAN_3_1 to C:/xyz/CRAN_3_1/bin/windows/contrib/3.1. Do not forget to run library(tools);write_PACKAGES("C:/xyz/CRAN_3_1/bin/windows/contrib/3.1"); if you also have to move your files.
Open yourProjDir/.Rprofile again and add at the end
local({r=getOption("repos");r["CRAN"]="file:///C:/xyz/CRAN_3_1";r["CRANextra"]=r["CRAN"];options(repos=r)})
Note the 3 / right after file! Save and exit file.
Close the project and re-open.
Now you can execute packrat::init() and it should run without errors.
It would be great if someone with more experience regarding packrat could give his/her input so that I can be sure that this hack works. Any pointers to proper solutions are highly appreciated, of course.
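Why the folder structure matters can be seen from how R derives the package directory from a repository URL. The sketch below uses utils::contrib.url from base R with the repository root from the .Rprofile line above. Note that it appends the version of the running R, so the 3.1 folder only matches when R 3.1.x is in use.

```r
# Repository root as used in the .Rprofile above (file URL with three slashes).
repo_root <- "file:///C:/xyz/CRAN_3_1"

# Directory in which R looks for Windows binary packages and the PACKAGES index.
binary_dir <- utils::contrib.url(repo_root, type = "win.binary")
binary_dir  # ends in /bin/windows/contrib/<R major.minor version>
```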

Warning message In download.file: download had nonzero exit status

I am downloading data from the data.gov website and I get the following two types of errors in the process:
fileUrl <- "http://catalog.data.gov/dataset/expenditures-on-children-by-families"
download.file(fileUrl,destfile=".data/studentdata.csv",method="curl")
Warning message:
In download.file(fileUrl, destfile = ".data/studentdata.csv", method = "curl") :
download had nonzero exit status
I tried removing method="curl", as suggested in another forum, but then I get this new error:
download.file(fileUrl,destfile=".data/studentdata.csv")
Error in download.file(fileUrl, destfile = ".data/studentdata.csv") :
cannot open destfile '.data/studentdata.csv', reason 'No such file or directory'
I think there are two major factors why your curl doesn't work well.
First, the problem is with your URL: fileUrl <- "http://catalog.data.gov/dataset/expenditures-on-children-by-families". It does not refer to a csv file, so the download will not give you a csv even if you set the destination to one, such as destfile = ".data/studentdata.csv".
I have an example of getting a csv dataset using the same code (different dataset):
DataURL <- "https://data.baltimorecity.gov/api/views/dz54-2aru/rows.csv?accessType=DOWNLOAD"  # this link refers to a rows.csv file
download.file(DataURL, destfile = "./data/rows.csv", method = "curl")  # same method as before, using curl
Second, I previously had the same problem of curl not working, even with a proper URL that refers to a csv file. When I dug a bit deeper, I found an interesting reason why the curl method did not work: my R session. I was using 32-bit R, in which the error occurred. When I switched the session to 64-bit R, the download ran. To see your R session's architecture (32-bit or 64-bit), type in R:
sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
You have to switch your R from 32-bit to 64-bit to avoid the 'curl' call had nonzero exit status warning. Go to your R installation folder and run the 64-bit R.
If you are on Windows and installed R in the default folder, you can run C:\Program Files\R\R-3.5.3\bin\x64\R.exe. (I used version 3.5.3, so the path may differ for your version.)
If you are using RStudio, you can switch the R session in the menu bar: Tools -> Global Options -> R version -> Change -> Use your machine's default version of R64 (64-bit) -> OK. Then restart RStudio.
However, this depends on your OS architecture. If you are using a 32-bit OS, you will have to find another way to solve this.
So looking at the code for download.file(...), if you specify method="curl" the function tries to use the curl shell command. If this command does not exist on your system, you will get the error above.
If you do not specify a method, the default is to use an internal R method to download, which evidently works on your system. In that case, the function is trying to put the file in .data/studentdata.csv, but evidently there is no .data directory. Try taking out the leading dot.
When this download works, you will get a text/html file, not a csv file. Your url points to a web page, not a download link. That page does have a download link, but unfortunately it is a pdf, not a csv.
Finally, if your goal is to have the data in R (is it?), and if the link actually produces a csv file, you could more easily use
df <- read.csv(fileUrl)
If I'm not very much mistaken you just have a simple typo here. I suspect you have a "data" directory, not a ".data" directory - in which case your only problem is that your destfile string needs to begin "./data", not ".data".
I was having the same problem.
Then I realized that I had forgotten to create the "data" directory!
So try adding this above your fileUrl line to create the directory first.
if(!file.exists("data")){
dir.create("data")
}
Also, if you are running a Mac, then you want to keep method="curl" when downloading a file over https. I don't believe Windows has that problem, hence the suggestions to remove it.
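The directory and path advice in the answers above can be combined into one sketch. The helper name prepare_destfile is hypothetical, and the example writes into the session's temporary directory rather than ./data. The commented-out call reuses the csv URL quoted in an earlier answer and is not executed here, since it needs network access.

```r
# Hypothetical helper: make sure the folder for destfile exists, then
# return the path so it can be passed straight to download.file().
prepare_destfile <- function(destfile) {
  dest_dir <- dirname(destfile)
  if (!dir.exists(dest_dir)) {
    dir.create(dest_dir, recursive = TRUE)
  }
  destfile
}

dest <- prepare_destfile(file.path(tempdir(), "data", "rows.csv"))

# Not run here (requires a network connection):
# download.file("https://data.baltimorecity.gov/api/views/dz54-2aru/rows.csv?accessType=DOWNLOAD",
#               destfile = dest)
```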
Try this:
file<-'http://catalog.data.gov/dataset/expenditures-on-children-by-families'
file<- read.csv(file)

R function zip(utils) doesn't work on my machine

I would like to use the zip function from utils. The unzip function works fine, but a call to zip, e.g. zip("out", "file.txt"), does not create the file "out.zip" in the working directory, and R does not return any message at all: neither an error message nor a message about successful compression. (When compression is successful, R prints info such as "deflate - 40%".) I checked various R versions and files, and zip still doesn't work. On other computers, everything works fine. I have Windows XP. Furthermore, when I give the name of a file that does not exist, e.g. zip("out", "this_file_doesnt_exist.txt"), R does not return any error at all! I don't know much about computers. What should I check? What could be the problem in my case? I turned off the antivirus, but that didn't help.
You can use gzip from the package R.utils
library(R.utils)
df <- data.frame(x=10)
write.csv(df, file="x1.csv")
gzip("x1.csv")
> dir()
[1] "x1.csv.gz"
For those reading this in 2020:
Install Rtools (https://cran.r-project.org/bin/windows/Rtools/) and follow the documentation steps.
Then the zip command will work on Windows 10 machines.
For Windows 10 users: while the above-mentioned Rtools approach did not work for me in the context of zip() {utils}, the {zip} package works fine without installing Rtools and without the need to change your existing code (as the {zip} package masks the functions zip() and unzip() of {utils}).
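Since the original problem was a zip call that fails silently, it can help to verify the result explicitly. This is a minimal sketch: zip_checked is a hypothetical wrapper that uses zip::zipr when the {zip} package is installed, falls back to utils::zip otherwise, and stops if no archive appears.

```r
# Hypothetical wrapper: create an archive and fail loudly if none is produced.
zip_checked <- function(zipfile, files) {
  missing_files <- files[!file.exists(files)]
  if (length(missing_files) > 0) {
    stop("Files not found: ", paste(missing_files, collapse = ", "))
  }
  if (requireNamespace("zip", quietly = TRUE)) {
    zip::zipr(zipfile, files)      # no external zip program needed
  } else {
    utils::zip(zipfile, files)     # relies on an external zip on the PATH
  }
  if (!file.exists(zipfile)) {
    stop("No archive was created; is a zip program available to R?")
  }
  invisible(zipfile)
}

# Usage example with a throwaway file in the session's temporary directory.
src <- file.path(tempdir(), "file.txt")
writeLines("hello", src)
out <- file.path(tempdir(), "out.zip")
if (requireNamespace("zip", quietly = TRUE) || nzchar(Sys.which("zip"))) {
  zip_checked(out, src)
}
```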

Resources